From 46909223565fc58edd901d21b48a87ce8ccf08fa Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Sun, 7 Dec 2003 11:15:16 +0000
Subject: [PATCH] Backport various tutorial fixups (with permission from the RM).

---
 Doc/tut/tut.tex | 189 +++++++++++++++++++++++++++---------------------
 1 file changed, 107 insertions(+), 82 deletions(-)

diff --git a/Doc/tut/tut.tex b/Doc/tut/tut.tex
index 6d7e7f43fd5a..77822c44bbef 100644
--- a/Doc/tut/tut.tex
+++ b/Doc/tut/tut.tex
@@ -2,9 +2,6 @@
 \usepackage[T1]{fontenc}
 
 % Things to do:
-% Add a section on file I/O
-% Write a chapter entitled ``Some Useful Modules''
-% --re, math+cmath
 % Should really move the Python startup file info to an appendix
 
 \title{Python Tutorial}
@@ -331,19 +328,19 @@ With that declaration, all characters in the source file will be treated as
 possible to directly write Unicode string literals in the selected
 encoding. The list of possible encodings can be found in the
 \citetitle[../lib/lib.html]{Python Library Reference}, in the section
-on \module{codecs}.
+on \ulink{\module{codecs}}{../lib/module-codecs.html}.
 
-If your editor supports saving files as \code{UTF-8} with an UTF-8
-signature (aka BOM -- Byte Order Mark), you can use that instead of an
+If your editor supports saving files as \code{UTF-8} with a UTF-8
+\emph{byte order mark} (aka BOM), you can use that instead of an
 encoding declaration. IDLE supports this capability if
 \code{Options/General/Default Source Encoding/UTF-8} is set. Notice
 that this signature is not understood in older Python releases (2.2
 and earlier), and also not understood by the operating system for
-\code{\#!} files.
+\code{\#!} files.
 
 By using UTF-8 (either through the signature or an encoding
 declaration), characters of most languages in the world can be used
-simultaneously in string literals and comments. Using non-ASCII
+simultaneously in string literals and comments. Using non-\ASCII
 characters in identifiers is not supported.
To display all these
 characters properly, your editor must recognize that the file is
 UTF-8, and it must use a font that supports all the characters in the
@@ -659,15 +656,14 @@ the first line above could also have been written \samp{word = 'Help'
 expressions:
 
 \begin{verbatim}
->>> import string
 >>> 'str' 'ing' # <- This is ok
 'string'
->>> string.strip('str') + 'ing' # <- This is ok
+>>> 'str'.strip() + 'ing' # <- This is ok
 'string'
->>> string.strip('str') 'ing' # <- This is invalid
+>>> 'str'.strip() 'ing' # <- This is invalid
   File "<stdin>", line 1, in ?
-    string.strip('str') 'ing'
-                            ^
+    'str'.strip() 'ing'
+                      ^
 SyntaxError: invalid syntax
 \end{verbatim}
 
@@ -809,6 +805,21 @@ The built-in function \function{len()} returns the length of a string:
 \end{verbatim}
 
 
+\begin{seealso}
+  \seetitle[../lib/typesseq.html]{Sequence Types}%
+  {Strings, and the Unicode strings described in the next
+   section, are examples of \emph{sequence types}, and
+   support the common operations supported by such types.}
+  \seetitle[../lib/string-methods.html]{String Methods}%
+  {Both strings and Unicode strings support a large number of
+   methods for basic transformations and searching.}
+  \seetitle[../lib/typesseq-strings.html]{String Formatting Operations}%
+  {The formatting operations invoked when strings and Unicode
+   strings are the left operand of the \code{\%} operator are
+   described in more detail here.}
+\end{seealso}
+
+
 \subsection{Unicode Strings \label{unicodeStrings}}
 \sectionauthor{Marc-Andre Lemburg}{mal@lemburg.com}
 
@@ -881,7 +892,7 @@ the more well known encodings which these codecs can convert are
 \emph{Latin-1}, \emph{ASCII}, \emph{UTF-8}, and \emph{UTF-16}.
 The latter two are variable-length encodings that store each Unicode
 character in one or more bytes. The default encoding is
-normally set to ASCII, which passes through characters in the range
+normally set to \ASCII, which passes through characters in the range
 0 to 127 and rejects any other characters with an error.
 When a Unicode string is printed, written to a file, or converted
 with \function{str()}, conversion takes place using this default encoding.
@@ -1518,7 +1529,7 @@ TypeError: function() got multiple values for keyword argument 'a'
 \end{verbatim}
 
 When a final formal parameter of the form \code{**\var{name}} is
-present, it receives a dictionary containing all keyword arguments
+present, it receives a \ulink{dictionary}{../lib/typesmapping.html} containing all keyword arguments
 whose keyword doesn't correspond to a formal parameter. This may be
 combined with a formal parameter of the form
 \code{*\var{name}} (described in the next subsection) which receives a
@@ -1838,19 +1849,14 @@ cubes:
 More than one sequence may be passed; the function must then have as
 many arguments as there are sequences and is called with the
 corresponding item from each sequence (or \code{None} if some sequence
-is shorter than another). If \code{None} is passed for the function,
-a function returning its argument(s) is substituted.
-
-Combining these two special cases, we see that
-\samp{map(None, \var{list1}, \var{list2})} is a convenient way of
-turning a pair of lists into a list of pairs. For example:
+is shorter than another). For example:
 
 \begin{verbatim}
 >>> seq = range(8)
->>> def square(x): return x*x
+>>> def add(x, y): return x+y
 ...
->>> map(None, seq, map(square, seq))
-[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)]
+>>> map(add, seq, seq)
+[0, 2, 4, 6, 8, 10, 12, 14]
 \end{verbatim}
 
 \samp{reduce(\var{func}, \var{sequence})} returns a single value
@@ -1973,9 +1979,10 @@ another value is assigned to it). We'll find other uses for
 
 We saw that lists and strings have many common properties, such as
 indexing and slicing operations. They are two examples of
-\emph{sequence} data types. Since Python is an evolving language,
-other sequence data types may be added. There is also another
-standard sequence data type: the \emph{tuple}.
+\ulink{\emph{sequence} data types}{../lib/typesseq.html}. Since
+Python is an evolving language, other sequence data types may be
+added. There is also another standard sequence data type: the
+\emph{tuple}.
 
 A tuple consists of a number of values separated by commas, for
 instance:
@@ -2045,7 +2052,8 @@ always creates a tuple, and unpacking works for any sequence.
 
 \section{Dictionaries \label{dictionaries}}
 
-Another useful data type built into Python is the \emph{dictionary}.
+Another useful data type built into Python is the
+\ulink{\emph{dictionary}}{../lib/typesmapping.html}.
 Dictionaries are sometimes found in other languages as ``associative
 memories'' or ``associative arrays''. Unlike sequences, which are
 indexed by a range of numbers, dictionaries are indexed by \emph{keys},
@@ -2073,11 +2081,11 @@ If you store using a key that is already in use, the old value
 associated with that key is forgotten. It is an error to extract a
 value using a non-existent key.
 
-The \code{keys()} method of a dictionary object returns a list of all
+The \method{keys()} method of a dictionary object returns a list of all
 the keys used in the dictionary, in random order (if you want it
-sorted, just apply the \code{sort()} method to the list of keys). To
+sorted, just apply the \method{sort()} method to the list of keys). To
 check whether a single key is in the dictionary, use the
-\code{has_key()} method of the dictionary.
+\method{has_key()} method of the dictionary.
 
 Here is a small example using a dictionary:
 
@@ -2151,6 +2159,21 @@ What is your quest? It is the holy grail.
 What is your favorite color? It is blue.
 \end{verbatim}
 
+To loop over a sequence in reverse, first specify the sequence
+in a forward direction and then call the \function{reversed()}
+function.
+
+\begin{verbatim}
+>>> for i in reversed(xrange(1,10,2)):
+...     print i
+...
+9
+7
+5
+3
+1
+\end{verbatim}
+
 
 \section{More on Conditions \label{conditions}}
 
@@ -2388,7 +2411,7 @@ script being run is on the search path, it is important that the
 script not have the same name as a standard module, or Python will
 attempt to load the script as a module when that module is imported.
 This will generally be an error. See section~\ref{standardModules},
-``Standard Modules.'' for more information.
+``Standard Modules,'' for more information.
 
 
 \subsection{``Compiled'' Python files}
@@ -2454,9 +2477,10 @@ library of Python code in a form that is moderately hard to reverse
 engineer.
 
 \item
-The module \module{compileall}\refstmodindex{compileall} can create
-\file{.pyc} files (or \file{.pyo} files when \programopt{-O} is used) for
-all modules in a directory.
+The module \ulink{\module{compileall}}{../lib/module-compileall.html}%
+{} \refstmodindex{compileall} can create \file{.pyc} files (or
+\file{.pyo} files when \programopt{-O} is used) for all modules in a
+directory.
 
 \end{itemize}
 
@@ -2473,7 +2497,8 @@ system calls. The set of such modules is a configuration option which
 also depends on the underlying platform For example,
 the \module{amoeba} module is only provided on systems that somehow
 support Amoeba primitives. One particular module deserves some
-attention: \module{sys}\refstmodindex{sys}, which is built into every
+attention: \ulink{\module{sys}}{../lib/module-sys.html}%
+\refstmodindex{sys}, which is built into every
 Python interpreter. The variables \code{sys.ps1} and
 \code{sys.ps2} define the strings used as primary and secondary
 prompts:
@@ -2756,14 +2781,15 @@ submodules with the same name from different packages.
 \subsection{Intra-package References}
 
 The submodules often need to refer to each other. For example, the
-\module{surround} module might use the \module{echo} module. In fact, such references
-are so common that the \code{import} statement first looks in the
+\module{surround} module might use the \module{echo} module.
In fact,
+such references
+are so common that the \keyword{import} statement first looks in the
 containing package before looking in the standard module search path.
 Thus, the surround module can simply use \code{import echo} or
 \code{from echo import echofilter}. If the imported module is not
 found in the current package (the package of which the current module
-is a submodule), the \code{import} statement looks for a top-level module
-with the given name.
+is a submodule), the \keyword{import} statement looks for a top-level
+module with the given name.
 
 When packages are structured into subpackages (as with the
 \module{Sound} package in the example), there's no shortcut to refer
@@ -2773,15 +2799,6 @@ must be used. For example, if the module
 in the \module{Sound.Effects} package, it can use \code{from
 Sound.Effects import echo}.
 
-%(One could design a notation to refer to parent packages, similar to
-%the use of ".." to refer to the parent directory in \UNIX{} and Windows
-%filesystems. In fact, the \module{ni} module, which was the
-%ancestor of this package system, supported this using \code{__} for
-%the package containing the current module,
-%\code{__.__} for the parent package, and so on. This feature was dropped
-%because of its awkwardness; since most packages will have a relative
-%shallow substructure, this is no big loss.)
-
 \subsection{Packages in Multiple Directories}
 
 Packages support one more special attribute, \member{__path__}. This
@@ -2873,11 +2890,10 @@ The value of x is 32.5, and y is 40000...
 Here are two ways to write a table of squares and cubes:
 
 \begin{verbatim}
->>> import string
 >>> for x in range(1, 11):
-...     print string.rjust(repr(x), 2), string.rjust(repr(x*x), 3),
+...     print repr(x).rjust(2), repr(x*x).rjust(3),
 ...     # Note trailing comma on previous line
-...     print string.rjust(repr(x*x*x), 4)
+...     print repr(x*x*x).rjust(4)
 ...
  1   1    1
  2   4    8
@@ -2907,28 +2923,27 @@ Here are two ways to write a table of squares and cubes:
 (Note that one space between each column was added by the way
 \keyword{print} works: it always adds spaces between its arguments.)
 
-This example demonstrates the function \function{string.rjust()},
+This example demonstrates the \method{rjust()} method of string objects,
 which right-justifies a string in a field of a given width by padding
-it with spaces on the left. There are similar functions
-\function{string.ljust()} and \function{string.center()}. These
-functions do not write anything, they just return a new string. If
+it with spaces on the left. There are similar methods
+\method{ljust()} and \method{center()}. These
+methods do not write anything, they just return a new string. If
 the input string is too long, they don't truncate it, but return it
 unchanged; this will mess up your column lay-out but that's usually
 better than the alternative, which would be lying about a value. (If
 you really want truncation you can always add a slice operation, as in
-\samp{string.ljust(x,~n)[0:n]}.)
+\samp{x.ljust(~n)[:n]}.)
 
-There is another function, \function{string.zfill()}, which pads a
+There is another method, \method{zfill()}, which pads a
 numeric string on the left with zeros. It understands about plus and
 minus signs:
 
 \begin{verbatim}
->>> import string
->>> string.zfill('12', 5)
+>>> '12'.zfill(5)
 '00012'
->>> string.zfill('-3.14', 7)
+>>> '-3.14'.zfill(7)
 '-003.14'
->>> string.zfill('3.14159265359', 5)
+>>> '3.14159265359'.zfill(5)
 '3.14159265359'
 \end{verbatim}
 
@@ -3111,14 +3126,15 @@ objects.
 Strings can easily be written to and read from a file. Numbers take a
 bit more effort, since the \method{read()} method only returns
 strings, which will have to be passed to a function like
-\function{string.atoi()}, which takes a string like \code{'123'} and
+\function{int()}, which takes a string like \code{'123'} and
 returns its numeric value 123.
However, when you want to save more
 complex data types like lists, dictionaries, or class instances,
 things get a lot more complicated.
 
 Rather than have users be constantly writing and debugging code to
 save complicated data types, Python provides a standard module called
-\module{pickle}. This is an amazing module that can take almost
+\ulink{\module{pickle}}{../lib/module-pickle.html}. This is an
+amazing module that can take almost
 any Python object (even some forms of Python code!), and convert it to
 a string representation; this process is called \dfn{pickling}.
 Reconstructing the object from the string representation is called
@@ -3143,12 +3159,15 @@ x = pickle.load(f)
 
 (There are other variants of this, used when pickling many objects or
 when you don't want to write the pickled data to a file; consult the
-complete documentation for \module{pickle} in the Library Reference.)
-
-\module{pickle} is the standard way to make Python objects which can
-be stored and reused by other programs or by a future invocation of
-the same program; the technical term for this is a
-\dfn{persistent} object. Because \module{pickle} is so widely used,
+complete documentation for
+\ulink{\module{pickle}}{../lib/module-pickle.html} in the
+\citetitle[../lib/]{Python Library Reference}.)
+
+\ulink{\module{pickle}}{../lib/module-pickle.html} is the standard way
+to make Python objects which can be stored and reused by other
+programs or by a future invocation of the same program; the technical
+term for this is a \dfn{persistent} object. Because
+\ulink{\module{pickle}}{../lib/module-pickle.html} is so widely used,
 many authors who write Python extensions take care to ensure that new
 data types such as matrices can be properly pickled and unpickled.
 
@@ -3294,12 +3313,12 @@ error message and then re-raise the exception (allowing a caller to
 handle the exception as well):
 
 \begin{verbatim}
-import string, sys
+import sys
 
 try:
     f = open('myfile.txt')
     s = f.readline()
-    i = int(string.strip(s))
+    i = int(s.strip())
 except IOError, (errno, strerror):
     print "I/O error(%s): %s" % (errno, strerror)
 except ValueError:
@@ -4338,7 +4357,7 @@ operating system:
 
 \begin{verbatim}
 >>> import os
->>> os.system('copy /data/mydata.fil /backup/mydata.fil')
+>>> os.system('time 0:02')
 0
 >>> os.getcwd() # Return the current working directory
 'C:\\Python24'
@@ -4425,9 +4444,7 @@ The most direct way to terminate a script is to use \samp{sys.exit()}.
 
 The \ulink{\module{re}}{../lib/module-re.html} module provides regular
 expression tools for advanced string processing.
-When only simple capabilities are needed, string methods are preferred
-because they are easier to read and debug. However, for more
-sophisticated applications, regular expressions can provide succinct,
+For complex matching and manipulation, regular expressions offer succinct,
 optimized solutions:
 
 \begin{verbatim}
@@ -4438,6 +4455,13 @@ optimized solutions:
 'cat in the hat'
 \end{verbatim}
 
+When only simple capabilities are needed, string methods are preferred
+because they are easier to read and debug:
+
+\begin{verbatim}
+>>> 'tea for too'.replace('too', 'two')
+'tea for two'
+\end{verbatim}
 
 \section{Mathematics\label{mathematics}}
 
@@ -4676,8 +4700,8 @@ than the main site, depending on your geographical location. A more
 informal site is \url{http://starship.python.net/}, which contains a
 bunch of Python-related personal home pages; many people have
 downloadable software there. Many more user-created Python modules
-can be found in a third-party repository at
-\url{http://www.vex.net/parnassus}.
+can be found in the \ulink{Python Package
+Index}{http://www.python.org/pypi} (PyPI).
 
 For Python-related questions and problem reports, you can post to the
 newsgroup \newsgroup{comp.lang.python}, or send them to the mailing
@@ -4690,8 +4714,7 @@ up to several hundred),
 % days = 116.9 msgs / day and steadily increasing.
 asking (and answering) questions, suggesting new features, and
 announcing new modules. Before posting, be sure to check the list of
-Frequently Asked Questions (also called the FAQ), at
-\url{http://www.python.org/doc/FAQ.html}, or look for it in the
+\ulink{Frequently Asked Questions}{http://www.python.org/doc/faq/} (also called the FAQ), or look for it in the
 \file{Misc/} directory of the Python source distribution. Mailing
 list archives are available at \url{http://www.python.org/pipermail/}.
 The FAQ answers many of the questions that come up again and again,
@@ -4789,7 +4812,8 @@ Tab: complete
 \end{verbatim}
 
 in your \file{\~{}/.inputrc}. (Of course, this makes it harder to
-type indented continuation lines.)
+type indented continuation lines if you're accustomed to using
+\kbd{Tab} for that purpose.)
 
 Automatic completion of variable and module names is optionally
 available. To enable it in the interpreter's interactive mode, add
@@ -4818,7 +4842,8 @@ this deletes the names it creates once they are no longer needed; this
 is done since the startup file is executed in the same namespace as
 the interactive commands, and removing the names avoids creating side
 effects in the interactive environments. You may find it convenient
-to keep some of the imported modules, such as \module{os}, which turn
+to keep some of the imported modules, such as
+\ulink{\module{os}}{../lib/module-os.html}, which turn
 out to be needed in most sessions with the interpreter.
 
 \begin{verbatim}
-- 
2.47.3