From: Raymond Hettinger <python@rcn.com>
Date: Mon, 28 May 2007 05:23:22 +0000 (+0000)
Subject: Explain when groupby() issues a new group.
X-Git-Tag: v2.6a1~1682
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=1749a1353217329b5cba90c502377b5cb05fe40b;p=thirdparty%2FPython%2Fcpython.git

Explain when groupby() issues a new group.
---
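Reviewer note (not part of the commit message): the uniq-like behavior
described in the hunk below can be illustrated with a minimal sketch. The
helper name run_lengths and the sample data are hypothetical and chosen only
for illustration; they do not appear in the patch. The sketch assumes nothing
beyond itertools.groupby() and the built-in sorted().

```python
from itertools import groupby


def run_lengths(items, key=None):
    """Return (key, count) pairs, one per consecutive run, much like `uniq -c`."""
    # groupby() starts a new group each time the key value changes,
    # so equal items that are not adjacent land in separate groups.
    return [(k, len(list(g))) for k, g in groupby(items, key)]


data = ["a", "b", "a", "a", "b"]

# Unsorted input: "a" and "b" each show up in more than one group.
unsorted_runs = run_lengths(data)
print(unsorted_runs)  # [('a', 1), ('b', 1), ('a', 2), ('b', 1)]

# Sorting first makes equal items adjacent, giving one group per distinct key,
# which is closer to what SQL's GROUP BY would produce.
sorted_runs = run_lengths(sorted(data))
print(sorted_runs)    # [('a', 3), ('b', 2)]
```

The first result shows why the patch adds the sorted() call to the documented
example: without it, a key can be reported more than once.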

diff --git a/Doc/lib/libitertools.tex b/Doc/lib/libitertools.tex
index ac6028b31a74..e2f0f0ef9a97 100644
--- a/Doc/lib/libitertools.tex
+++ b/Doc/lib/libitertools.tex
@@ -138,6 +138,13 @@ by functions or loops that truncate the stream.
   identity function and returns  the element unchanged.  Generally, the
   iterable needs to already be sorted on the same key function.
 
+  The operation of \function{groupby()} is similar to the \code{uniq} filter
+  in \UNIX{}.  It generates a break or new group every time the value
+  of the key function changes (which is why it is usually necessary
+  to have sorted the data using the same key function).  That behavior
+  differs from SQL's GROUP BY, which aggregates common elements regardless
+  of their input order.
+
   The returned group is itself an iterator that shares the underlying
   iterable with \function{groupby()}.  Because the source is shared, when
   the \function{groupby} object is advanced, the previous group is no
@@ -147,6 +154,7 @@ by functions or loops that truncate the stream.
   \begin{verbatim}
     groups = []
     uniquekeys = []
+    data = sorted(data, key=keyfunc)
     for k, g in groupby(data, keyfunc):
         groups.append(list(g))      # Store group iterator as a list
         uniquekeys.append(k)