\usepackage[T1]{fontenc}
% Things to do:
-% Add a section on file I/O
-% Write a chapter entitled ``Some Useful Modules''
-% --re, math+cmath
% Should really move the Python startup file info to an appendix
\title{Python Tutorial}
possible to directly write Unicode string literals in the selected
encoding. The list of possible encodings can be found in the
\citetitle[../lib/lib.html]{Python Library Reference}, in the section
-on \module{codecs}.
+on \ulink{\module{codecs}}{../lib/module-codecs.html}.
-If your editor supports saving files as \code{UTF-8} with an UTF-8
-signature (aka BOM -- Byte Order Mark), you can use that instead of an
+If your editor supports saving files as \code{UTF-8} with a UTF-8
+\emph{byte order mark} (aka BOM), you can use that instead of an
encoding declaration. IDLE supports this capability if
\code{Options/General/Default Source Encoding/UTF-8} is set. Notice
that this signature is not understood in older Python releases (2.2
and earlier), and also not understood by the operating system for
-\code{\#!} files.
+\code{\#!} files.
By using UTF-8 (either through the signature or an encoding
declaration), characters of most languages in the world can be used
-simultaneously in string literals and comments. Using non-ASCII
+simultaneously in string literals and comments. Using non-\ASCII
characters in identifiers is not supported. To display all these
characters properly, your editor must recognize that the file is
UTF-8, and it must use a font that supports all the characters in the
expressions:
\begin{verbatim}
->>> import string
>>> 'str' 'ing' # <- This is ok
'string'
->>> string.strip('str') + 'ing' # <- This is ok
+>>> 'str'.strip() + 'ing' # <- This is ok
'string'
->>> string.strip('str') 'ing' # <- This is invalid
+>>> 'str'.strip() 'ing' # <- This is invalid
File "<stdin>", line 1, in ?
- string.strip('str') 'ing'
- ^
+ 'str'.strip() 'ing'
+ ^
SyntaxError: invalid syntax
\end{verbatim}
\end{verbatim}
+\begin{seealso}
+ \seetitle[../lib/typesseq.html]{Sequence Types}%
+ {Strings, and the Unicode strings described in the next
+ section, are examples of \emph{sequence types}, and
+ support the common operations supported by such types.}
+ \seetitle[../lib/string-methods.html]{String Methods}%
+ {Both strings and Unicode strings support a large number of
+ methods for basic transformations and searching.}
+ \seetitle[../lib/typesseq-strings.html]{String Formatting Operations}%
+ {The formatting operations invoked when strings and Unicode
+ strings are the left operand of the \code{\%} operator are
+ described in more detail here.}
+\end{seealso}
+
+
\subsection{Unicode Strings \label{unicodeStrings}}
\sectionauthor{Marc-Andre Lemburg}{mal@lemburg.com}
\emph{Latin-1}, \emph{ASCII}, \emph{UTF-8}, and \emph{UTF-16}.
The latter two are variable-length encodings that store each Unicode
character in one or more bytes. The default encoding is
-normally set to ASCII, which passes through characters in the range
+normally set to \ASCII, which passes through characters in the range
0 to 127 and rejects any other characters with an error.
When a Unicode string is printed, written to a file, or converted
with \function{str()}, conversion takes place using this default encoding.
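
As a minimal sketch of the behaviour described above, the following
snippet names the \ASCII{} codec explicitly instead of relying on the
interpreter's default encoding (that choice, and the variable names,
are assumptions made for illustration). Characters in the range 0 to
127 encode cleanly; anything else raises an error:

```python
# Illustrative only: the codec is named explicitly rather than taken
# from the interpreter-wide default encoding.
plain = u'abc'
accented = u'\xe4\xf6\xfc'

encoded = plain.encode('ascii')      # every character is below 128

try:
    accented.encode('ascii')         # characters above 127 are rejected
    rejected = False
except UnicodeError:
    rejected = True
```

Catching \exception{UnicodeError} keeps the sketch short; real code
would usually pick an encoding such as UTF-8 that can represent the
characters instead of discarding them.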
\end{verbatim}
When a final formal parameter of the form \code{**\var{name}} is
-present, it receives a dictionary containing all keyword arguments
+present, it receives a \ulink{dictionary}{../lib/typesmapping.html} containing all keyword arguments
whose keyword doesn't correspond to a formal parameter. This may be
combined with a formal parameter of the form
\code{*\var{name}} (described in the next subsection) which receives a
More than one sequence may be passed; the function must then have as
many arguments as there are sequences and is called with the
corresponding item from each sequence (or \code{None} if some sequence
-is shorter than another). If \code{None} is passed for the function,
-a function returning its argument(s) is substituted.
-
-Combining these two special cases, we see that
-\samp{map(None, \var{list1}, \var{list2})} is a convenient way of
-turning a pair of lists into a list of pairs. For example:
+is shorter than another). For example:
\begin{verbatim}
>>> seq = range(8)
->>> def square(x): return x*x
+>>> def add(x, y): return x+y
...
->>> map(None, seq, map(square, seq))
-[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)]
+>>> map(add, seq, seq)
+[0, 2, 4, 6, 8, 10, 12, 14]
\end{verbatim}
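
The same call can be sketched with two different sequences. The names
\code{first} and \code{second} are illustrative, and the
\function{list()} wrapper is an assumption added so that the snippet
also behaves the same on interpreters where \function{map()} returns
an iterator rather than a list:

```python
def add(x, y):
    # Called once per position with one item from each sequence.
    return x + y

first = [1, 2, 3]
second = [10, 20, 30]

# list() forces the result into a list where map() is lazy.
sums = list(map(add, first, second))
```

Here \code{sums} holds one result per position, because
\function{add()} takes exactly as many arguments as there are
sequences.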
\samp{reduce(\var{func}, \var{sequence})} returns a single value
We saw that lists and strings have many common properties, such as
indexing and slicing operations. They are two examples of
-\emph{sequence} data types. Since Python is an evolving language,
-other sequence data types may be added. There is also another
-standard sequence data type: the \emph{tuple}.
+\ulink{\emph{sequence} data types}{../lib/typesseq.html}. Since
+Python is an evolving language, other sequence data types may be
+added. There is also another standard sequence data type: the
+\emph{tuple}.
A tuple consists of a number of values separated by commas, for
instance:
\section{Dictionaries \label{dictionaries}}
-Another useful data type built into Python is the \emph{dictionary}.
+Another useful data type built into Python is the
+\ulink{\emph{dictionary}}{../lib/typesmapping.html}.
Dictionaries are sometimes found in other languages as ``associative
memories'' or ``associative arrays''. Unlike sequences, which are
indexed by a range of numbers, dictionaries are indexed by \emph{keys},
associated with that key is forgotten. It is an error to extract a
value using a non-existent key.
-The \code{keys()} method of a dictionary object returns a list of all
+The \method{keys()} method of a dictionary object returns a list of all
the keys used in the dictionary, in arbitrary order (if you want it
-sorted, just apply the \code{sort()} method to the list of keys). To
+sorted, just apply the \method{sort()} method to the list of keys). To
check whether a single key is in the dictionary, use the
-\code{has_key()} method of the dictionary.
+\method{has_key()} method of the dictionary.
Here is a small example using a dictionary:
What is your favorite color? It is blue.
\end{verbatim}
+To loop over a sequence in reverse, first specify the sequence
+in a forward direction and then call the \function{reversed()}
+function.
+
+\begin{verbatim}
+>>> for i in reversed(xrange(1,10,2)):
+... print i
+...
+9
+7
+5
+3
+1
+\end{verbatim}
+
\section{More on Conditions \label{conditions}}
script not have the same name as a standard module, or Python will
attempt to load the script as a module when that module is imported.
This will generally be an error. See section~\ref{standardModules},
-``Standard Modules.'' for more information.
+``Standard Modules,'' for more information.
\subsection{``Compiled'' Python files}
engineer.
\item
-The module \module{compileall}\refstmodindex{compileall} can create
-\file{.pyc} files (or \file{.pyo} files when \programopt{-O} is used) for
-all modules in a directory.
+The module \ulink{\module{compileall}}{../lib/module-compileall.html}%
+{} \refstmodindex{compileall} can create \file{.pyc} files (or
+\file{.pyo} files when \programopt{-O} is used) for all modules in a
+directory.
\end{itemize}
also depends on the underlying platform. For example,
the \module{amoeba} module is only provided on systems that somehow
support Amoeba primitives. One particular module deserves some
-attention: \module{sys}\refstmodindex{sys}, which is built into every
+attention: \ulink{\module{sys}}{../lib/module-sys.html}%
+\refstmodindex{sys}, which is built into every
Python interpreter. The variables \code{sys.ps1} and
\code{sys.ps2} define the strings used as primary and secondary
prompts:
\subsection{Intra-package References}
The submodules often need to refer to each other. For example, the
-\module{surround} module might use the \module{echo} module. In fact, such references
-are so common that the \code{import} statement first looks in the
+\module{surround} module might use the \module{echo} module. In fact,
+such references
+are so common that the \keyword{import} statement first looks in the
containing package before looking in the standard module search path.
Thus, the surround module can simply use \code{import echo} or
\code{from echo import echofilter}. If the imported module is not
found in the current package (the package of which the current module
-is a submodule), the \code{import} statement looks for a top-level module
-with the given name.
+is a submodule), the \keyword{import} statement looks for a top-level
+module with the given name.
When packages are structured into subpackages (as with the
\module{Sound} package in the example), there's no shortcut to refer
in the \module{Sound.Effects} package, it can use \code{from
Sound.Effects import echo}.
-%(One could design a notation to refer to parent packages, similar to
-%the use of ".." to refer to the parent directory in \UNIX{} and Windows
-%filesystems. In fact, the \module{ni} module, which was the
-%ancestor of this package system, supported this using \code{__} for
-%the package containing the current module,
-%\code{__.__} for the parent package, and so on. This feature was dropped
-%because of its awkwardness; since most packages will have a relative
-%shallow substructure, this is no big loss.)
-
\subsection{Packages in Multiple Directories}
Packages support one more special attribute, \member{__path__}. This
Here are two ways to write a table of squares and cubes:
\begin{verbatim}
->>> import string
>>> for x in range(1, 11):
-... print string.rjust(repr(x), 2), string.rjust(repr(x*x), 3),
+... print repr(x).rjust(2), repr(x*x).rjust(3),
... # Note trailing comma on previous line
-... print string.rjust(repr(x*x*x), 4)
+... print repr(x*x*x).rjust(4)
...
1 1 1
2 4 8
(Note that one space between each column was added by the way
\keyword{print} works: it always adds spaces between its arguments.)
-This example demonstrates the function \function{string.rjust()},
+This example demonstrates the \method{rjust()} method of string objects,
which right-justifies a string in a field of a given width by padding
-it with spaces on the left. There are similar functions
-\function{string.ljust()} and \function{string.center()}. These
-functions do not write anything, they just return a new string. If
+it with spaces on the left. There are similar methods
+\method{ljust()} and \method{center()}. These
+methods do not write anything; they just return a new string. If
the input string is too long, they don't truncate it, but return it
unchanged; this will mess up your column layout, but that's usually
better than the alternative, which would be lying about a value. (If
you really want truncation you can always add a slice operation, as in
-\samp{string.ljust(x,~n)[0:n]}.)
+\samp{x.ljust(n)[:n]}.)
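
The padding methods can be tried out in a short sketch. The string
\code{'spam'} and the field widths are arbitrary values chosen for
illustration:

```python
word = 'spam'

right = word.rjust(8)           # pad on the left:  '    spam'
left = word.ljust(8)            # pad on the right: 'spam    '
middle = word.center(8)         # pad on both sides: '  spam  '

# A string longer than the field is returned unchanged, not truncated.
unchanged = word.rjust(2)

# Slicing after padding forces the result to the field width.
truncated = word.ljust(2)[:2]
```

None of these calls modifies \code{word}; each one returns a new
string.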
-There is another function, \function{string.zfill()}, which pads a
+There is another method, \method{zfill()}, which pads a
numeric string on the left with zeros. It understands plus and
minus signs:
\begin{verbatim}
->>> import string
->>> string.zfill('12', 5)
+>>> '12'.zfill(5)
'00012'
->>> string.zfill('-3.14', 7)
+>>> '-3.14'.zfill(7)
'-003.14'
->>> string.zfill('3.14159265359', 5)
+>>> '3.14159265359'.zfill(5)
'3.14159265359'
\end{verbatim}
Strings can easily be written to and read from a file. Numbers take a
bit more effort, since the \method{read()} method only returns
strings, which will have to be passed to a function like
-\function{string.atoi()}, which takes a string like \code{'123'} and
+\function{int()}, which takes a string like \code{'123'} and
returns its numeric value 123. However, when you want to save more
complex data types like lists, dictionaries, or class instances,
things get a lot more complicated.
Rather than have users be constantly writing and debugging code to
save complicated data types, Python provides a standard module called
-\module{pickle}. This is an amazing module that can take almost
+\ulink{\module{pickle}}{../lib/module-pickle.html}. This is an
+amazing module that can take almost
any Python object (even some forms of Python code!), and convert it to
a string representation; this process is called \dfn{pickling}.
Reconstructing the object from the string representation is called
(There are other variants of this, used when pickling many objects or
when you don't want to write the pickled data to a file; consult the
-complete documentation for \module{pickle} in the Library Reference.)
-
-\module{pickle} is the standard way to make Python objects which can
-be stored and reused by other programs or by a future invocation of
-the same program; the technical term for this is a
-\dfn{persistent} object. Because \module{pickle} is so widely used,
+complete documentation for
+\ulink{\module{pickle}}{../lib/module-pickle.html} in the
+\citetitle[../lib/]{Python Library Reference}.)
+
+\ulink{\module{pickle}}{../lib/module-pickle.html} is the standard way
+to make Python objects which can be stored and reused by other
+programs or by a future invocation of the same program; the technical
+term for this is a \dfn{persistent} object. Because
+\ulink{\module{pickle}}{../lib/module-pickle.html} is so widely used,
many authors who write Python extensions take care to ensure that new
data types such as matrices can be properly pickled and unpickled.
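
A minimal round trip can be sketched with the in-memory functions
\function{pickle.dumps()} and \function{pickle.loads()}. The
\code{record} dictionary is a made-up example value, not something
taken from this tutorial:

```python
import pickle

# Hypothetical data to be saved and restored.
record = {'name': 'spam', 'sizes': [1, 2, 3]}

data = pickle.dumps(record)      # pickling: object -> serialized form
restored = pickle.loads(data)    # unpickling: serialized form -> object
```

The restored object compares equal to the original but is a separate
copy, so changing one does not affect the other.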
handle the exception as well):
\begin{verbatim}
-import string, sys
+import sys
try:
f = open('myfile.txt')
s = f.readline()
- i = int(string.strip(s))
+ i = int(s.strip())
except IOError, (errno, strerror):
print "I/O error(%s): %s" % (errno, strerror)
except ValueError:
\begin{verbatim}
>>> import os
->>> os.system('copy /data/mydata.fil /backup/mydata.fil')
+>>> os.system('time 0:02')
0
>>> os.getcwd() # Return the current working directory
'C:\\Python24'
The \ulink{\module{re}}{../lib/module-re.html}
module provides regular expression tools for advanced string processing.
-When only simple capabilities are needed, string methods are preferred
-because they are easier to read and debug. However, for more
-sophisticated applications, regular expressions can provide succinct,
+For complex matching and manipulation, regular expressions offer succinct,
optimized solutions:
\begin{verbatim}
'cat in the hat'
\end{verbatim}
+When only simple capabilities are needed, string methods are preferred
+because they are easier to read and debug:
+
+\begin{verbatim}
+>>> 'tea for too'.replace('too', 'two')
+'tea for two'
+\end{verbatim}
\section{Mathematics\label{mathematics}}
informal site is \url{http://starship.python.net/}, which contains a
bunch of Python-related personal home pages; many people have
downloadable software there. Many more user-created Python modules
-can be found in a third-party repository at
-\url{http://www.vex.net/parnassus}.
+can be found in the \ulink{Python Package
+Index}{http://www.python.org/pypi} (PyPI).
For Python-related questions and problem reports, you can post to the
newsgroup \newsgroup{comp.lang.python}, or send them to the mailing
% days = 116.9 msgs / day and steadily increasing.
asking (and answering) questions, suggesting new features, and
announcing new modules. Before posting, be sure to check the list of
-Frequently Asked Questions (also called the FAQ), at
-\url{http://www.python.org/doc/FAQ.html}, or look for it in the
+\ulink{Frequently Asked Questions}{http://www.python.org/doc/faq/}
+(also called the FAQ), or look for it in the
\file{Misc/} directory of the Python source distribution. Mailing
list archives are available at \url{http://www.python.org/pipermail/}.
The FAQ answers many of the questions that come up again and again,
\end{verbatim}
in your \file{\~{}/.inputrc}. (Of course, this makes it harder to
-type indented continuation lines.)
+type indented continuation lines if you're accustomed to using
+\kbd{Tab} for that purpose.)
Automatic completion of variable and module names is optionally
available. To enable it in the interpreter's interactive mode, add
is done since the startup file is executed in the same namespace as
the interactive commands, and removing the names avoids creating side
effects in the interactive environments. You may find it convenient
-to keep some of the imported modules, such as \module{os}, which turn
+to keep some of the imported modules, such as
+\ulink{\module{os}}{../lib/module-os.html}, which turn
out to be needed in most sessions with the interpreter.
\begin{verbatim}