From: Raymond Hettinger Date: Mon, 23 Feb 2009 20:48:09 +0000 (+0000) Subject: Sync-up itertools docs with Py3.1 version. X-Git-Tag: 3.0~367 X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=3ebf7c688e5db88fd055ef3d90ed6ddd6bf3cc34;p=thirdparty%2FPython%2Fcpython.git Sync-up itertools docs with Py3.1 version. * Merge 69722: rewrite introduction and add summary table * Merge 69740: generalize tee() recipe * Fix tab alignment on recipes to match other recipes --- diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst index c523d15a6991..14a348cf20a4 100644 --- a/Doc/library/itertools.rst +++ b/Doc/library/itertools.rst @@ -13,39 +13,59 @@ from itertools import * -This module implements a number of :term:`iterator` building blocks inspired by -constructs from the Haskell and SML programming languages. Each has been recast -in a form suitable for Python. +This module implements a number of :term:`iterator` building blocks inspired +by constructs from APL, Haskell, and SML. Each has been recast in a form +suitable for Python. The module standardizes a core set of fast, memory efficient tools that are -useful by themselves or in combination. Standardization helps avoid the -readability and reliability problems which arise when many different individuals -create their own slightly varying implementations, each with their own quirks -and naming conventions. - -The tools are designed to combine readily with one another. This makes it easy -to construct more specialized tools succinctly and efficiently in pure Python. +useful by themselves or in combination. Together, they form an "iterator +algebra" making it possible to construct specialized tools succinctly and +efficiently in pure Python. For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python by combining :func:`map` and :func:`count` to form ``map(f, count())``. -Likewise, the functional tools are designed to work well with the high-speed -functions provided by the :mod:`operator` module. +The tools also work well with the high-speed functions in the :mod:`operator` +module. For example, the plus-operator can be mapped across two vectors to +form an efficient dot-product: ``sum(map(operator.add, vector1, vector2))``. + -Whether cast in pure python form or compiled code, tools that use iterators are -more memory efficient (and often faster) than their list based counterparts. Adopting -the principles of just-in-time manufacturing, they create data when and where -needed instead of consuming memory with the computer equivalent of "inventory". +**Infinite Iterators:** + ================== ================= ================================================= + Iterator Arguments Results + ================== ================= ================================================= + :func:`count` start start, start+1, start+2*1, ... + :func:`cycle` p p0, p1, ... plast, p0, p1, ... + :func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times + ================== ================= ================================================= -.. seealso:: +**Iterators terminating on the shortest input sequence:** - The Standard ML Basis Library, `The Standard ML Basis Library - `_. + ==================== ============================ ================================================= + Iterator Arguments Results + ==================== ============================ ================================================= + :func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... + :func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails + :func:`filterfalse` pred, seq elements of seq where pred(elem) is False + :func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v) + :func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] + :func:`starmap` func, seq func(\*seq[0]), fun(\*seq[1]), ... + :func:`tee` it, n it1, it2 , ... itn splits one iterator into n + :func:`takewhile` pred, seq seq[0], seq[1], until pred fails + :func:`zip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... + ==================== ============================ ================================================= - Haskell, A Purely Functional Language, `Definition of Haskell and the Standard - Libraries `_. +**Combinatoric generators:** + + ===================================== ==================== ================================================= + Iterator Arguments Results + ===================================== ==================== ================================================= + :func:`product` p, q, ... [repeat=1] cartesian product + :func:`permutations` p[, r] r-length permutations (without repeated elements) + :func:`combinations` p[, r] r-length combinations (sorted and no repeats) + ===================================== ==================== ================================================= .. _itertools-functions: @@ -108,7 +128,7 @@ loops that truncate the stream. return indices = list(range(r)) yield tuple(pool[i] for i in indices) - while 1: + while True: for i in reversed(range(r)): if indices[i] != i + n - r: break @@ -415,28 +435,28 @@ loops that truncate the stream. .. function:: tee(iterable[, n=2]) - Return *n* independent iterators from a single iterable. The case where ``n==2`` - is equivalent to:: + Return *n* independent iterators from a single iterable. Equivalent to:: - def tee(iterable): - def gen(next, data={}): - for i in count(): - if i in data: - yield data.pop(i) - else: - data[i] = next() - yield data[i] - it = iter(iterable) - return (gen(it.__next__), gen(it.__next__)) + def tee(iterable, n=2): + it = iter(iterable) + deques = [collections.deque() for i in range(n)] + def gen(mydeque): + while True: + if not mydeque: # when the local deque is empty + newval = next(it) # fetch a new value and + for d in deques: # load it to all the deques + d.append(newval) + yield mydeque.popleft() + return tuple(gen(d) for d in deques) - Note, once :func:`tee` has made a split, the original *iterable* should not be - used anywhere else; otherwise, the *iterable* could get advanced without the tee - objects being informed. + Once :func:`tee` has made a split, the original *iterable* should not be + used anywhere else; otherwise, the *iterable* could get advanced without + the tee objects being informed. - Note, this member of the toolkit may require significant auxiliary storage - (depending on how much temporary data needs to be stored). In general, if one - iterator is going to use most or all of the data before the other iterator, it - is faster to use :func:`list` instead of :func:`tee`. + This itertool may require significant auxiliary storage (depending on how + much temporary data needs to be stored). In general, if one iterator uses + most or all of the data before another iterator starts, it is faster to use + :func:`list` instead of :func:`tee`. .. function:: zip_longest(*iterables[, fillvalue]) @@ -612,26 +632,26 @@ which incur interpreter overhead. indices[i:] = [indices[i] + 1] * (r - i) yield tuple(pool[i] for i in indices) - def unique_everseen(iterable, key=None): - "List unique elements, preserving order. Remember all elements ever seen." - # unique_everseen('AAAABBBCCDAABBB') --> A B C D - # unique_everseen('ABBCcAD', str.lower) --> A B C D - seen = set() - seen_add = seen.add - if key is None: - for element in iterable: - if element not in seen: - seen_add(element) - yield element - else: - for element in iterable: - k = key(element) - if k not in seen: - seen_add(k) - yield element - - def unique_justseen(iterable, key=None): - "List unique elements, preserving order. Remember only the element just seen." - # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B - # unique_justseen('ABBCcAD', str.lower) --> A B C A D - return map(next, map(itemgetter(1), groupby(iterable, key))) + def unique_everseen(iterable, key=None): + "List unique elements, preserving order. Remember all elements ever seen." + # unique_everseen('AAAABBBCCDAABBB') --> A B C D + # unique_everseen('ABBCcAD', str.lower) --> A B C D + seen = set() + seen_add = seen.add + if key is None: + for element in iterable: + if element not in seen: + seen_add(element) + yield element + else: + for element in iterable: + k = key(element) + if k not in seen: + seen_add(k) + yield element + + def unique_justseen(iterable, key=None): + "List unique elements, preserving order. Remember only the element just seen." + # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B + # unique_justseen('ABBCcAD', str.lower) --> A B C A D + return map(next, imap(itemgetter(1), groupby(iterable, key)))