git.ipfire.org Git - thirdparty/postgresql.git/commitdiff
Implement Eager Aggregation
author Richard Guo <rguo@postgresql.org>
Wed, 8 Oct 2025 08:04:23 +0000 (17:04 +0900)
committer Richard Guo <rguo@postgresql.org>
Wed, 8 Oct 2025 08:04:23 +0000 (17:04 +0900)
Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.
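
As a rough illustration (the tables t1(id, g) and t2(t1_id, x) below are
hypothetical, not taken from the patch), the following query can be
rewritten so that t2 is partially aggregated before the join; the
hand-written equivalent is valid because sum() is decomposable and the
pushed-down grouping key includes the join column t2.t1_id:

    SELECT t1.g, sum(t2.x)
    FROM t1 JOIN t2 ON t1.id = t2.t1_id
    GROUP BY t1.g;

    -- Semantically equivalent form with eager (partial) aggregation:
    SELECT t1.g, sum(p.s)
    FROM t1
         JOIN (SELECT t2.t1_id, sum(t2.x) AS s
               FROM t2
               GROUP BY t2.t1_id) p ON t1.id = p.t1_id
    GROUP BY t1.g;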

In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations.  To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node.  During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied.  If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
is currently not supported.

To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo <guofenglinux@gmail.com>
Author: Antonin Houska <ah@cybertec.at> (in an older version)
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me> (in an older version)
Reviewed-by: Andy Fan <zhihuifan1213@163.com> (in an older version)
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> (in an older version)
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com

26 files changed:
contrib/postgres_fdw/expected/postgres_fdw.out
doc/src/sgml/config.sgml
src/backend/optimizer/README
src/backend/optimizer/geqo/geqo_eval.c
src/backend/optimizer/path/allpaths.c
src/backend/optimizer/path/joinrels.c
src/backend/optimizer/plan/initsplan.c
src/backend/optimizer/plan/planmain.c
src/backend/optimizer/plan/planner.c
src/backend/optimizer/util/appendinfo.c
src/backend/optimizer/util/relnode.c
src/backend/utils/misc/guc_parameters.dat
src/backend/utils/misc/postgresql.conf.sample
src/include/nodes/pathnodes.h
src/include/optimizer/pathnode.h
src/include/optimizer/paths.h
src/include/optimizer/planmain.h
src/test/regress/expected/collate.icu.utf8.out
src/test/regress/expected/eager_aggregate.out [new file with mode: 0644]
src/test/regress/expected/join.out
src/test/regress/expected/partition_aggregate.out
src/test/regress/expected/sysviews.out
src/test/regress/parallel_schedule
src/test/regress/sql/eager_aggregate.sql [new file with mode: 0644]
src/test/regress/sql/partition_aggregate.sql
src/tools/pgindent/typedefs.list

contrib/postgres_fdw/expected/postgres_fdw.out
index f2f8130af878521d12fa3c2b866f1fcc65c4bb30..91bbd0d8c7347eefc9ec0b3c0ecb6fae95564763 100644 (file)
@@ -3701,30 +3701,33 @@ select count(t1.c3) from ft2 t1 left join ft2 t2 on (t1.c1 = random() * t2.c2);
 -- Subquery in FROM clause having aggregate
 explain (verbose, costs off)
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
-                                          QUERY PLAN                                           
------------------------------------------------------------------------------------------------
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
  Sort
-   Output: (count(*)), x.b
-   Sort Key: (count(*)), x.b
-   ->  HashAggregate
-         Output: count(*), x.b
-         Group Key: x.b
-         ->  Hash Join
-               Output: x.b
-               Inner Unique: true
-               Hash Cond: (ft1.c2 = x.a)
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.c2
-                     Remote SQL: SELECT c2 FROM "S 1"."T 1"
-               ->  Hash
-                     Output: x.b, x.a
-                     ->  Subquery Scan on x
-                           Output: x.b, x.a
-                           ->  Foreign Scan
-                                 Output: ft1_1.c2, (sum(ft1_1.c1))
-                                 Relations: Aggregate on (public.ft1 ft1_1)
-                                 Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
-(21 rows)
+   Output: (count(*)), (sum(ft1_1.c1))
+   Sort Key: (count(*)), (sum(ft1_1.c1))
+   ->  Finalize GroupAggregate
+         Output: count(*), (sum(ft1_1.c1))
+         Group Key: (sum(ft1_1.c1))
+         ->  Sort
+               Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+               Sort Key: (sum(ft1_1.c1))
+               ->  Hash Join
+                     Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+                     Hash Cond: (ft1_1.c2 = ft1.c2)
+                     ->  Foreign Scan
+                           Output: ft1_1.c2, (sum(ft1_1.c1))
+                           Relations: Aggregate on (public.ft1 ft1_1)
+                           Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
+                     ->  Hash
+                           Output: ft1.c2, (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: ft1.c2, PARTIAL count(*)
+                                 Group Key: ft1.c2
+                                 ->  Foreign Scan on public.ft1
+                                       Output: ft1.c2
+                                       Remote SQL: SELECT c2 FROM "S 1"."T 1"
+(24 rows)
 
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
  count |   b   
doc/src/sgml/config.sgml
index e9b420f3ddbeff0c63f16f6c59eb9bbcba304f2a..39e658b7808be56f34ec42a05a696dd91863d18a 100644 (file)
@@ -5475,6 +5475,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
       <term><varname>enable_gathermerge</varname> (<type>boolean</type>)
       <indexterm>
@@ -6095,6 +6110,22 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-min-eager-agg-group-size" xreflabel="min_eager_agg_group_size">
+      <term><varname>min_eager_agg_group_size</varname> (<type>floating point</type>)
+      <indexterm>
+       <primary><varname>min_eager_agg_group_size</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Sets the minimum average group size required to consider applying
+        eager aggregation. This helps avoid the overhead of eager
+        aggregation when it does not offer significant row count reduction.
+        The default is <literal>8</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-jit-above-cost" xreflabel="jit_above_cost">
       <term><varname>jit_above_cost</varname> (<type>floating point</type>)
       <indexterm>
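
Not part of the patch, but as a usage sketch: the two settings added above
can be adjusted per session, for example to compare plans with and without
eager aggregation (the query reuses the hypothetical t1/t2 tables from the
commit message):

    SET enable_eager_aggregate = off;      -- disable eager aggregation entirely
    SET min_eager_agg_group_size = 16;     -- or: require larger average groups
    EXPLAIN (COSTS OFF)
    SELECT t1.g, sum(t2.x)
    FROM t1 JOIN t2 ON t1.id = t2.t1_id
    GROUP BY t1.g;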
src/backend/optimizer/README
index 843368096fd0d243dfea24cf2a0d0ee6dac2e794..6c35baceedb2fbe34d138ff10b4b6f9f5d76ff71 100644 (file)
@@ -1500,3 +1500,113 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially
+pushes aggregation past a join, and finalizes it once all the
+relations are joined.  Eager aggregation may reduce the number of
+input rows to the join and thus could result in a better overall plan.
+
+To prove that the transformation is correct, let's first consider the
+case where only inner joins are involved.  In this case, we partition
+the tables in the FROM clause into two groups: those that contain at
+least one aggregation column, and those that do not contain any
+aggregation columns.  Each group can be treated as a single relation
+formed by the Cartesian product of the tables within that group.
+Therefore, without loss of generality, we can assume that the FROM
+clause contains exactly two relations, R1 and R2, where R1 represents
+the relation containing all aggregation columns, and R2 represents the
+relation without any aggregation columns.
+
+Let the query be of the form:
+
+SELECT G, AGG(A)
+FROM R1 JOIN R2 ON J
+GROUP BY G;
+
+where G is the set of grouping keys that may include columns from R1
+and/or R2; AGG(A) is an aggregate function over columns A from R1; J
+is the join condition between R1 and R2.
+
+The transformation of eager aggregation is:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 ON J)
+
+This equivalence holds under the following conditions:
+
+1) AGG is decomposable, meaning that it can be computed in two stages:
+a partial aggregation followed by a final aggregation;
+2) The set G1 used in the pre-aggregation of R1 includes:
+    * all columns from R1 that are part of the grouping keys G, and
+    * all columns from R1 that appear in the join condition J.
+3) The grouping operator for any column in G1 must be compatible with
+the operator used for that column in the join condition J.
+
+Since G1 includes all columns from R1 that appear in either the
+grouping keys G or the join condition J, all rows within each partial
+group have identical values for both the grouping keys and the
+join-relevant columns from R1, assuming compatible operators are used.
+As a result, the rows within a partial group are indistinguishable in
+terms of their contribution to the aggregation and their behavior in
+the join.  This ensures that all rows in the same partial group share
+the same "destiny": they either all match or all fail to match a given
+row in R2.  Because the aggregate function AGG is decomposable,
+aggregating the partial results after the join yields the same final
+result as aggregating after the full join, thereby preserving query
+semantics.  Q.E.D.
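
A concrete (hypothetical) instance of this equivalence in plain SQL, with
A = r1.a, J = (r1.j = r2.j), and G = {r1.g1, r2.g2}, so that
G1 = {r1.g1, r1.j}; sum() qualifies because it is decomposable:

    SELECT r1.g1, r2.g2, sum(r1.a)
    FROM r1 JOIN r2 ON r1.j = r2.j
    GROUP BY r1.g1, r2.g2;

    -- equivalent form with partial aggregation pushed to r1:
    SELECT p.g1, r2.g2, sum(p.s)
    FROM (SELECT r1.g1, r1.j, sum(r1.a) AS s
          FROM r1
          GROUP BY r1.g1, r1.j) p          -- pre-aggregation over G1
         JOIN r2 ON p.j = r2.j
    GROUP BY p.g1, r2.g2;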
+
+In the case where there are any outer joins, the situation becomes
+more complex due to join order constraints and the semantics of
+null-extension in outer joins.  If the relations that contain at least
+one aggregation column cannot be treated as a single relation because
+of the join order constraints, partial aggregation paths will not be
+generated, and thus the transformation is not applicable.  Otherwise,
+let R1 be the relation containing all aggregation columns, and R2, R3,
+... be the remaining relations.  From the inner join case, under the
+aforementioned conditions, we have the equivalence:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 JOIN R3 ...)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 JOIN R3 ...)
+
+To preserve correctness when outer joins are involved, we require an
+additional condition:
+
+4) R1 must not be on the nullable side of any outer join.
+
+This condition ensures that partial aggregation over R1 does not
+suppress any null-extended rows that would be introduced by outer
+joins.  If R1 is on the nullable side of an outer join, the
+NULL-extended rows produced by the outer join would not be available
+when we perform the partial aggregation, while with a
+non-eager-aggregation plan these rows are available for the top-level
+aggregation.  Pushing partial aggregation in this case may result in
+the rows being grouped differently than expected, or produce incorrect
+values from the aggregate functions.
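
One way to see the hazard, as a hedged illustration with hypothetical
tables: count(*) counts every NULL-extended row produced by the outer
join, but a partial count computed on r1 before the join has no row to
contribute for an r2 row that finds no match, so the finalized count
would come out too small.

    -- r1 is on the nullable side; each unmatched r2 row must still add 1
    -- to count(*), which a pre-join partial aggregation over r1 cannot do.
    SELECT r2.g, count(*)
    FROM r2 LEFT JOIN r1 ON r2.j = r1.j
    GROUP BY r2.g;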
+
+During the construction of the join tree, we evaluate each base or
+join relation to determine if eager aggregation can be applied.  If
+feasible, we create a separate RelOptInfo called a "grouped relation"
+and generate grouped paths by adding sorted and hashed partial
+aggregation paths on top of the non-grouped paths.  To limit planning
+time, we consider only the cheapest or suitably-sorted non-grouped
+paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation
+with a non-grouped relation.  Joining two grouped relations is
+currently not supported.
+
+To further limit planning time, we currently adopt a strategy where
+partial aggregation is pushed only to the lowest feasible level in the
+join tree where it provides a significant reduction in row count.
+This strategy also helps ensure that all grouped paths for the same
+grouped relation produce the same set of rows, which is important to
+support a fundamental assumption of the planner.
+
+If we have generated a grouped relation for the topmost join relation,
+we need to finalize its paths at the end.  The final paths will
+compete in the usual way with paths built from regular planning.
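
Schematically, following the postgres_fdw EXPLAIN output earlier in this
commit, a winning eager-aggregation plan finalizes the aggregate on top of
the join, with the partial aggregate below it (shape sketch only, not
literal planner output):

    -- Finalize GroupAggregate
    --   Group Key: <query's grouping keys>
    --   -> Join
    --        -> Scan on the non-aggregated side
    --        -> Partial HashAggregate   (or a sorted partial aggregate)
    --             Group Key: <pushed-down grouping keys>
    --             -> Scan on the aggregated side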
src/backend/optimizer/geqo/geqo_eval.c
index 7fcb1aa70d1c35261df002bda78a10b385f227aa..8005754c8c633f293f59a62ca2a005aaac5bdd1f 100644 (file)
@@ -264,6 +264,9 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
                        /* Keep searching if join order is not valid */
                        if (joinrel)
                        {
+                               bool            is_top_rel = bms_equal(joinrel->relids,
+                                                                                                  root->all_query_rels);
+
                                /* Create paths for partitionwise joins. */
                                generate_partitionwise_join_paths(root, joinrel);
 
@@ -273,12 +276,28 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
                                 * rel once we know the final targetlist (see
                                 * grouping_planner).
                                 */
-                               if (!bms_equal(joinrel->relids, root->all_query_rels))
+                               if (!is_top_rel)
                                        generate_useful_gather_paths(root, joinrel, false);
 
                                /* Find and save the cheapest paths for this joinrel */
                                set_cheapest(joinrel);
 
+                               /*
+                                * Except for the topmost scan/join rel, consider generating
+                                * partial aggregation paths for the grouped relation on top
+                                * of the paths of this rel.  After that, we're done creating
+                                * paths for the grouped relation, so run set_cheapest().
+                                */
+                               if (joinrel->grouped_rel != NULL && !is_top_rel)
+                               {
+                                       RelOptInfo *grouped_rel = joinrel->grouped_rel;
+
+                                       Assert(IS_GROUPED_REL(grouped_rel));
+
+                                       generate_grouped_paths(root, grouped_rel, joinrel);
+                                       set_cheapest(grouped_rel);
+                               }
+
                                /* Absorb new clump into old */
                                old_clump->joinrel = joinrel;
                                old_clump->size += new_clump->size;
src/backend/optimizer/path/allpaths.c
index 1f82239b4e0b7b91292014a3f21420da27b9c639..ec27c1a49941aad81405b9e7972e9aa67b7466ba 100644 (file)
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,7 +79,9 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool           enable_geqo = false;    /* just in case GUC doesn't set it */
+bool           enable_eager_aggregate = true;
 int                    geqo_threshold;
+double         min_eager_agg_group_size;
 int                    min_parallel_table_scan_size;
 int                    min_parallel_index_scan_size;
 
@@ -90,6 +94,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_simple_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
                                                 Index rti, RangeTblEntry *rte);
@@ -114,6 +119,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
                                                                Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                                                                        Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
                                                                                 List *live_childrels,
                                                                                 List *all_child_pathkeys);
@@ -182,6 +188,12 @@ make_one_rel(PlannerInfo *root, List *joinlist)
         */
        set_base_rel_sizes(root);
 
+       /*
+        * Build grouped relations for simple rels (i.e., base or "other" member
+        * relations) where possible.
+        */
+       setup_simple_grouped_rels(root);
+
        /*
         * We should now have size estimates for every actual table involved in
         * the query, and we also know which if any have been deleted from the
@@ -323,6 +335,39 @@ set_base_rel_sizes(PlannerInfo *root)
        }
 }
 
+/*
+ * setup_simple_grouped_rels
+ *       For each simple relation, build a grouped simple relation if eager
+ *       aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_simple_grouped_rels(PlannerInfo *root)
+{
+       Index           rti;
+
+       /*
+        * If there are no aggregate expressions or grouping expressions, eager
+        * aggregation is not possible.
+        */
+       if (root->agg_clause_list == NIL ||
+               root->group_expr_list == NIL)
+               return;
+
+       for (rti = 1; rti < root->simple_rel_array_size; rti++)
+       {
+               RelOptInfo *rel = root->simple_rel_array[rti];
+
+               /* there may be empty slots corresponding to non-baserel RTEs */
+               if (rel == NULL)
+                       continue;
+
+               Assert(rel->relid == rti);      /* sanity check on array */
+               Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+               (void) build_simple_grouped_rel(root, rel);
+       }
+}
+
 /*
  * set_base_rel_pathlists
  *       Finds all paths available for scanning each base-relation entry.
@@ -559,6 +604,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
        /* Now find the cheapest of the paths for this rel */
        set_cheapest(rel);
 
+       /*
+        * If a grouped relation for this rel exists, build partial aggregation
+        * paths for it.
+        *
+        * Note that this can only happen after we've called set_cheapest() for
+        * this base rel, because we need its cheapest paths.
+        */
+       set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
        pprint(rel);
 #endif
@@ -1305,6 +1359,35 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
        add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *       If a grouped relation for the given 'rel' exists, build partial
+ *       aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+       RelOptInfo *grouped_rel;
+
+       /*
+        * If there are no aggregate expressions or grouping expressions, eager
+        * aggregation is not possible.
+        */
+       if (root->agg_clause_list == NIL ||
+               root->group_expr_list == NIL)
+               return;
+
+       /* Add paths to the grouped base relation if one exists. */
+       grouped_rel = rel->grouped_rel;
+       if (grouped_rel)
+       {
+               Assert(IS_GROUPED_REL(grouped_rel));
+
+               generate_grouped_paths(root, grouped_rel, rel);
+               set_cheapest(grouped_rel);
+       }
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3334,6 +3417,345 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
        }
 }
 
+/*
+ * generate_grouped_paths
+ *             Generate paths for a grouped relation by adding sorted and hashed
+ *             partial aggregation paths on top of paths of the ungrouped relation.
+ *
+ * The information needed is provided by the RelAggInfo structure stored in
+ * "grouped_rel".
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
+                                          RelOptInfo *rel)
+{
+       RelAggInfo *agg_info = grouped_rel->agg_info;
+       AggClauseCosts agg_costs;
+       bool            can_hash;
+       bool            can_sort;
+       Path       *cheapest_total_path = NULL;
+       Path       *cheapest_partial_path = NULL;
+       double          dNumGroups = 0;
+       double          dNumPartialGroups = 0;
+       List       *group_pathkeys = NIL;
+
+       if (IS_DUMMY_REL(rel))
+       {
+               mark_dummy_rel(grouped_rel);
+               return;
+       }
+
+       /*
+        * We push partial aggregation only to the lowest possible level in the
+        * join tree that is deemed useful.
+        */
+       if (!bms_equal(agg_info->apply_at, rel->relids) ||
+               !agg_info->agg_useful)
+               return;
+
+       MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+       get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+       /*
+        * Determine whether it's possible to perform sort-based implementations
+        * of grouping, and generate the pathkeys that represent the grouping
+        * requirements in that case.
+        */
+       can_sort = grouping_is_sortable(agg_info->group_clauses);
+       if (can_sort)
+       {
+               RelOptInfo *top_grouped_rel;
+               List       *top_group_tlist;
+
+               top_grouped_rel = IS_OTHER_REL(rel) ?
+                       rel->top_parent->grouped_rel : grouped_rel;
+               top_group_tlist =
+                       make_tlist_from_pathtarget(top_grouped_rel->agg_info->target);
+
+               group_pathkeys =
+                       make_pathkeys_for_sortclauses(root, agg_info->group_clauses,
+                                                                                 top_group_tlist);
+       }
+
+       /*
+        * Determine whether we should consider hash-based implementations of
+        * grouping.
+        */
+       Assert(root->numOrderedAggs == 0);
+       can_hash = (agg_info->group_clauses != NIL &&
+                               grouping_is_hashable(agg_info->group_clauses));
+
+       /*
+        * Consider whether we should generate partially aggregated non-partial
+        * paths.  We can only do this if we have a non-partial path.
+        */
+       if (rel->pathlist != NIL)
+       {
+               cheapest_total_path = rel->cheapest_total_path;
+               Assert(cheapest_total_path != NULL);
+       }
+
+       /*
+        * If parallelism is possible for grouped_rel, then we should consider
+        * generating partially-grouped partial paths.  However, if the ungrouped
+        * rel has no partial paths, then we can't.
+        */
+       if (grouped_rel->consider_parallel && rel->partial_pathlist != NIL)
+       {
+               cheapest_partial_path = linitial(rel->partial_pathlist);
+               Assert(cheapest_partial_path != NULL);
+       }
+
+       /* Estimate number of partial groups. */
+       if (cheapest_total_path != NULL)
+               dNumGroups = estimate_num_groups(root,
+                                                                                agg_info->group_exprs,
+                                                                                cheapest_total_path->rows,
+                                                                                NULL, NULL);
+       if (cheapest_partial_path != NULL)
+               dNumPartialGroups = estimate_num_groups(root,
+                                                                                               agg_info->group_exprs,
+                                                                                               cheapest_partial_path->rows,
+                                                                                               NULL, NULL);
+
+       if (can_sort && cheapest_total_path != NULL)
+       {
+               ListCell   *lc;
+
+               /*
+                * Use any available suitably-sorted path as input, and also consider
+                * sorting the cheapest-total path and incremental sort on any paths
+                * with presorted keys.
+                *
+                * To save planning time, we ignore parameterized input paths unless
+                * they are the cheapest-total path.
+                */
+               foreach(lc, rel->pathlist)
+               {
+                       Path       *input_path = (Path *) lfirst(lc);
+                       Path       *path;
+                       bool            is_sorted;
+                       int                     presorted_keys;
+
+                       /*
+                        * Ignore parameterized paths that are not the cheapest-total
+                        * path.
+                        */
+                       if (input_path->param_info &&
+                               input_path != cheapest_total_path)
+                               continue;
+
+                       is_sorted = pathkeys_count_contained_in(group_pathkeys,
+                                                                                                       input_path->pathkeys,
+                                                                                                       &presorted_keys);
+
+                       /*
+                        * Ignore paths that are not suitably or partially sorted, unless
+                        * they are the cheapest total path (no need to deal with paths
+                        * which have presorted keys when incremental sort is disabled).
+                        */
+                       if (!is_sorted && input_path != cheapest_total_path &&
+                               (presorted_keys == 0 || !enable_incremental_sort))
+                               continue;
+
+                       /*
+                        * Since the path originates from a non-grouped relation that is
+                        * not aware of eager aggregation, we must ensure that it provides
+                        * the correct input for partial aggregation.
+                        */
+                       path = (Path *) create_projection_path(root,
+                                                                                                  grouped_rel,
+                                                                                                  input_path,
+                                                                                                  agg_info->agg_input);
+
+                       if (!is_sorted)
+                       {
+                               /*
+                                * We've no need to consider both a sort and incremental sort.
+                                * We'll just do a sort if there are no presorted keys and an
+                                * incremental sort when there are presorted keys.
+                                */
+                               if (presorted_keys == 0 || !enable_incremental_sort)
+                                       path = (Path *) create_sort_path(root,
+                                                                                                        grouped_rel,
+                                                                                                        path,
+                                                                                                        group_pathkeys,
+                                                                                                        -1.0);
+                               else
+                                       path = (Path *) create_incremental_sort_path(root,
+                                                                                                                                grouped_rel,
+                                                                                                                                path,
+                                                                                                                                group_pathkeys,
+                                                                                                                                presorted_keys,
+                                                                                                                                -1.0);
+                       }
+
+                       /*
+                        * qual is NIL because the HAVING clause cannot be evaluated until
+                        * the final value of the aggregate is known.
+                        */
+                       path = (Path *) create_agg_path(root,
+                                                                                       grouped_rel,
+                                                                                       path,
+                                                                                       agg_info->target,
+                                                                                       AGG_SORTED,
+                                                                                       AGGSPLIT_INITIAL_SERIAL,
+                                                                                       agg_info->group_clauses,
+                                                                                       NIL,
+                                                                                       &agg_costs,
+                                                                                       dNumGroups);
+
+                       add_path(grouped_rel, path);
+               }
+       }
+
+       if (can_sort && cheapest_partial_path != NULL)
+       {
+               ListCell   *lc;
+
+               /* Similar to above logic, but for partial paths. */
+               foreach(lc, rel->partial_pathlist)
+               {
+                       Path       *input_path = (Path *) lfirst(lc);
+                       Path       *path;
+                       bool            is_sorted;
+                       int                     presorted_keys;
+
+                       is_sorted = pathkeys_count_contained_in(group_pathkeys,
+                                                                                                       input_path->pathkeys,
+                                                                                                       &presorted_keys);
+
+                       /*
+                        * Ignore paths that are not suitably or partially sorted, unless
+                        * they are the cheapest partial path (no need to deal with paths
+                        * which have presorted keys when incremental sort is disabled).
+                        */
+                       if (!is_sorted && input_path != cheapest_partial_path &&
+                               (presorted_keys == 0 || !enable_incremental_sort))
+                               continue;
+
+                       /*
+                        * Since the path originates from a non-grouped relation that is
+                        * not aware of eager aggregation, we must ensure that it provides
+                        * the correct input for partial aggregation.
+                        */
+                       path = (Path *) create_projection_path(root,
+                                                                                                  grouped_rel,
+                                                                                                  input_path,
+                                                                                                  agg_info->agg_input);
+
+                       if (!is_sorted)
+                       {
+                               /*
+                                * We've no need to consider both a sort and incremental sort.
+                                * We'll just do a sort if there are no presorted keys and an
+                                * incremental sort when there are presorted keys.
+                                */
+                               if (presorted_keys == 0 || !enable_incremental_sort)
+                                       path = (Path *) create_sort_path(root,
+                                                                                                        grouped_rel,
+                                                                                                        path,
+                                                                                                        group_pathkeys,
+                                                                                                        -1.0);
+                               else
+                                       path = (Path *) create_incremental_sort_path(root,
+                                                                                                                                grouped_rel,
+                                                                                                                                path,
+                                                                                                                                group_pathkeys,
+                                                                                                                                presorted_keys,
+                                                                                                                                -1.0);
+                       }
+
+                       /*
+                        * qual is NIL because the HAVING clause cannot be evaluated until
+                        * the final value of the aggregate is known.
+                        */
+                       path = (Path *) create_agg_path(root,
+                                                                                       grouped_rel,
+                                                                                       path,
+                                                                                       agg_info->target,
+                                                                                       AGG_SORTED,
+                                                                                       AGGSPLIT_INITIAL_SERIAL,
+                                                                                       agg_info->group_clauses,
+                                                                                       NIL,
+                                                                                       &agg_costs,
+                                                                                       dNumPartialGroups);
+
+                       add_partial_path(grouped_rel, path);
+               }
+       }
+
+       /*
+        * Add a partially-grouped HashAgg Path where possible
+        */
+       if (can_hash && cheapest_total_path != NULL)
+       {
+               Path       *path;
+
+               /*
+                * Since the path originates from a non-grouped relation that is not
+                * aware of eager aggregation, we must ensure that it provides the
+                * correct input for partial aggregation.
+                */
+               path = (Path *) create_projection_path(root,
+                                                                                          grouped_rel,
+                                                                                          cheapest_total_path,
+                                                                                          agg_info->agg_input);
+
+               /*
+                * qual is NIL because the HAVING clause cannot be evaluated until the
+                * final value of the aggregate is known.
+                */
+               path = (Path *) create_agg_path(root,
+                                                                               grouped_rel,
+                                                                               path,
+                                                                               agg_info->target,
+                                                                               AGG_HASHED,
+                                                                               AGGSPLIT_INITIAL_SERIAL,
+                                                                               agg_info->group_clauses,
+                                                                               NIL,
+                                                                               &agg_costs,
+                                                                               dNumGroups);
+
+               add_path(grouped_rel, path);
+       }
+
+       /*
+        * Now add a partially-grouped HashAgg partial Path where possible
+        */
+       if (can_hash && cheapest_partial_path != NULL)
+       {
+               Path       *path;
+
+               /*
+                * Since the path originates from a non-grouped relation that is not
+                * aware of eager aggregation, we must ensure that it provides the
+                * correct input for partial aggregation.
+                */
+               path = (Path *) create_projection_path(root,
+                                                                                          grouped_rel,
+                                                                                          cheapest_partial_path,
+                                                                                          agg_info->agg_input);
+
+               /*
+                * qual is NIL because the HAVING clause cannot be evaluated until the
+                * final value of the aggregate is known.
+                */
+               path = (Path *) create_agg_path(root,
+                                                                               grouped_rel,
+                                                                               path,
+                                                                               agg_info->target,
+                                                                               AGG_HASHED,
+                                                                               AGGSPLIT_INITIAL_SERIAL,
+                                                                               agg_info->group_clauses,
+                                                                               NIL,
+                                                                               &agg_costs,
+                                                                               dNumPartialGroups);
+
+               add_partial_path(grouped_rel, path);
+       }
+}
+
 /*
  * make_rel_from_joinlist
  *       Build access paths using a "joinlist" to guide the join path search.
@@ -3493,11 +3915,19 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
                 *
                 * After that, we're done creating paths for the joinrel, so run
                 * set_cheapest().
+                *
+                * In addition, we also run generate_grouped_paths() for the grouped
+                * relation of each just-processed joinrel, and run set_cheapest() for
+                * the grouped relation afterwards.
                 */
                foreach(lc, root->join_rel_level[lev])
                {
+                       bool            is_top_rel;
+
                        rel = (RelOptInfo *) lfirst(lc);
 
+                       is_top_rel = bms_equal(rel->relids, root->all_query_rels);
+
                        /* Create paths for partitionwise joins. */
                        generate_partitionwise_join_paths(root, rel);
 
@@ -3507,12 +3937,28 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
                         * once we know the final targetlist (see grouping_planner and
                         * its call to apply_scanjoin_target_to_paths).
                         */
-                       if (!bms_equal(rel->relids, root->all_query_rels))
+                       if (!is_top_rel)
                                generate_useful_gather_paths(root, rel, false);
 
                        /* Find and save the cheapest paths for this rel */
                        set_cheapest(rel);
 
+                       /*
+                        * Except for the topmost scan/join rel, consider generating
+                        * partial aggregation paths for the grouped relation on top of
+                        * the paths of this rel.  After that, we're done creating paths
+                        * for the grouped relation, so run set_cheapest().
+                        */
+                       if (rel->grouped_rel != NULL && !is_top_rel)
+                       {
+                               RelOptInfo *grouped_rel = rel->grouped_rel;
+
+                               Assert(IS_GROUPED_REL(grouped_rel));
+
+                               generate_grouped_paths(root, grouped_rel, rel);
+                               set_cheapest(grouped_rel);
+                       }
+
 #ifdef OPTIMIZER_DEBUG
                        pprint(rel);
 #endif
@@ -4382,6 +4828,25 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
                if (IS_DUMMY_REL(child_rel))
                        continue;
 
+               /*
+                * Except for the topmost scan/join rel, consider generating partial
+                * aggregation paths for the grouped relation on top of the paths of
+                * this partitioned child-join.  After that, we're done creating paths
+                * for the grouped relation, so run set_cheapest().
+                */
+               if (child_rel->grouped_rel != NULL &&
+                       !bms_equal(IS_OTHER_REL(rel) ?
+                                          rel->top_parent_relids : rel->relids,
+                                          root->all_query_rels))
+               {
+                       RelOptInfo *grouped_rel = child_rel->grouped_rel;
+
+                       Assert(IS_GROUPED_REL(grouped_rel));
+
+                       generate_grouped_paths(root, grouped_rel, child_rel);
+                       set_cheapest(grouped_rel);
+               }
+
 #ifdef OPTIMIZER_DEBUG
                pprint(child_rel);
 #endif
src/backend/optimizer/path/joinrels.c
index 535248aa525161fd23ca5d2343becc25d57bdf49..43b84d239ed4e425c60765912b9001dfcea08dbd 100644 (file)
@@ -16,6 +16,7 @@
 
 #include "miscadmin.h"
 #include "optimizer/appendinfo.h"
+#include "optimizer/cost.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -36,6 +37,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
                                                                                  RelOptInfo *joinrel,
                                                                                  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+                                                                 RelOptInfo *rel2, RelOptInfo *joinrel,
+                                                                 SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
                                                                                RelOptInfo *rel2, RelOptInfo *joinrel,
                                                                                SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -762,6 +766,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
                return joinrel;
        }
 
+       /* Build a grouped join relation for 'joinrel' if possible. */
+       make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+                                                 restrictlist);
+
        /* Add paths to the join relation. */
        populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
                                                                restrictlist);
@@ -873,6 +881,186 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
        return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *       Build a grouped join relation for the given "joinrel" if eager
+ *       aggregation is applicable and the resulting grouped paths are considered
+ *       useful.
+ *
+ * There are two strategies for generating grouped paths for a join relation:
+ *
+ * 1. Join a grouped (partially aggregated) input relation with a non-grouped
+ * input (e.g., AGG(B) JOIN A).
+ *
+ * 2. Apply partial aggregation (sorted or hashed) on top of existing
+ * non-grouped join paths (e.g., AGG(A JOIN B)).
+ *
+ * To limit planning effort and avoid an explosion of alternatives, we adopt a
+ * strategy where partial aggregation is only pushed to the lowest possible
+ * level in the join tree that is deemed useful.  That is, if grouped paths can
+ * be built using the first strategy, we skip consideration of the second
+ * strategy for the same join level.
+ *
+ * Additionally, if there are multiple lowest useful levels where partial
+ * aggregation could be applied, such as in a join tree with relations A, B,
+ * and C where both "AGG(A JOIN B) JOIN C" and "A JOIN AGG(B JOIN C)" are valid
+ * placements, we choose only the first one encountered during join search.
+ * This avoids generating multiple versions of the same grouped relation based
+ * on different aggregation placements.
+ *
+ * These heuristics also ensure that all grouped paths for the same grouped
+ * relation produce the same set of rows, which is a basic assumption in the
+ * planner.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+                                         RelOptInfo *rel2, RelOptInfo *joinrel,
+                                         SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+       RelOptInfo *grouped_rel;
+       RelOptInfo *grouped_rel1;
+       RelOptInfo *grouped_rel2;
+       bool            rel1_empty;
+       bool            rel2_empty;
+       Relids          agg_apply_at;
+
+       /*
+        * If there are no aggregate expressions or grouping expressions, eager
+        * aggregation is not possible.
+        */
+       if (root->agg_clause_list == NIL ||
+               root->group_expr_list == NIL)
+               return;
+
+       /* Retrieve the grouped relations for the two input rels */
+       grouped_rel1 = rel1->grouped_rel;
+       grouped_rel2 = rel2->grouped_rel;
+
+       rel1_empty = (grouped_rel1 == NULL || IS_DUMMY_REL(grouped_rel1));
+       rel2_empty = (grouped_rel2 == NULL || IS_DUMMY_REL(grouped_rel2));
+
+       /* Find or construct a grouped joinrel for this joinrel */
+       grouped_rel = joinrel->grouped_rel;
+       if (grouped_rel == NULL)
+       {
+               RelAggInfo *agg_info = NULL;
+
+               /*
+                * Prepare the information needed to create grouped paths for this
+                * join relation.
+                */
+               agg_info = create_rel_agg_info(root, joinrel, rel1_empty == rel2_empty);
+               if (agg_info == NULL)
+                       return;
+
+               /*
+                * If grouped paths for the given join relation are not considered
+                * useful, and no grouped paths can be built by joining grouped input
+                * relations, skip building the grouped join relation.
+                */
+               if (!agg_info->agg_useful &&
+                       (rel1_empty == rel2_empty))
+                       return;
+
+               /* build the grouped relation */
+               grouped_rel = build_grouped_rel(root, joinrel);
+               grouped_rel->reltarget = agg_info->target;
+
+               if (rel1_empty != rel2_empty)
+               {
+                       /*
+                        * If there is exactly one grouped input relation, then we can
+                        * build grouped paths by joining the input relations.  Set size
+                        * estimates for the grouped join relation based on the input
+                        * relations, and update the set of relids where partial
+                        * aggregation is applied to that of the grouped input relation.
+                        */
+                       set_joinrel_size_estimates(root, grouped_rel,
+                                                                          rel1_empty ? rel1 : grouped_rel1,
+                                                                          rel2_empty ? rel2 : grouped_rel2,
+                                                                          sjinfo, restrictlist);
+                       agg_info->apply_at = rel1_empty ?
+                               grouped_rel2->agg_info->apply_at :
+                               grouped_rel1->agg_info->apply_at;
+               }
+               else
+               {
+                       /*
+                        * Otherwise, grouped paths can be built by applying partial
+                        * aggregation on top of existing non-grouped join paths.  Set
+                        * size estimates for the grouped join relation based on the
+                        * estimated number of groups, and track the set of relids where
+                        * partial aggregation is applied.  Note that these values may be
+                        * updated later if it is determined that grouped paths can be
+                        * constructed by joining other input relations.
+                        */
+                       grouped_rel->rows = agg_info->grouped_rows;
+                       agg_info->apply_at = bms_copy(joinrel->relids);
+               }
+
+               grouped_rel->agg_info = agg_info;
+               joinrel->grouped_rel = grouped_rel;
+       }
+
+       Assert(IS_GROUPED_REL(grouped_rel));
+
+       /* We may have already proven this grouped join relation to be dummy. */
+       if (IS_DUMMY_REL(grouped_rel))
+               return;
+
+       /*
+        * Nothing to do if there's no grouped input relation.  Also, joining two
+        * grouped relations is not currently supported.
+        */
+       if (rel1_empty == rel2_empty)
+               return;
+
+       /*
+        * Get the set of relids where partial aggregation is applied among the
+        * given input relations.
+        */
+       agg_apply_at = rel1_empty ?
+               grouped_rel2->agg_info->apply_at :
+               grouped_rel1->agg_info->apply_at;
+
+       /*
+        * If it's not the designated level, skip building grouped paths.
+        *
+        * One exception is when it is a subset of the previously recorded level.
+        * In that case, we need to update the designated level to this one, and
+        * adjust the size estimates for the grouped join relation accordingly.
+        * For example, suppose partial aggregation can be applied on top of (B
+        * JOIN C).  If we first construct the join as ((A JOIN B) JOIN C), we'd
+        * record the designated level as including all three relations (A B C).
+        * Later, when we consider (A JOIN (B JOIN C)), we encounter the smaller
+        * (B C) join level directly.  Since this is a subset of the previous
+        * level and still valid for partial aggregation, we update the designated
+        * level to (B C), and adjust the size estimates accordingly.
+        */
+       if (!bms_equal(agg_apply_at, grouped_rel->agg_info->apply_at))
+       {
+               if (bms_is_subset(agg_apply_at, grouped_rel->agg_info->apply_at))
+               {
+                       /* Adjust the size estimates for the grouped join relation. */
+                       set_joinrel_size_estimates(root, grouped_rel,
+                                                                          rel1_empty ? rel1 : grouped_rel1,
+                                                                          rel2_empty ? rel2 : grouped_rel2,
+                                                                          sjinfo, restrictlist);
+                       grouped_rel->agg_info->apply_at = agg_apply_at;
+               }
+               else
+                       return;
+       }
+
+       /* Make paths for the grouped join relation. */
+       populate_joinrel_with_paths(root,
+                                                               rel1_empty ? rel1 : grouped_rel1,
+                                                               rel2_empty ? rel2 : grouped_rel2,
+                                                               grouped_rel,
+                                                               sjinfo,
+                                                               restrictlist);
+}
+
 /*
  * populate_joinrel_with_paths
  *       Add paths to the given joinrel for given pair of joining relations. The
@@ -1615,6 +1803,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
                                                 adjust_child_relids(joinrel->relids,
                                                                                         nappinfos, appinfos)));
 
+               /* Build a grouped join relation for 'child_joinrel' if possible */
+               make_grouped_join_rel(root, child_rel1, child_rel2,
+                                                         child_joinrel, child_sjinfo,
+                                                         child_restrictlist);
+
                /* And make paths for the child join */
                populate_joinrel_with_paths(root, child_rel1, child_rel2,
                                                                        child_joinrel, child_sjinfo,
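A minimal standalone sketch of the designated-level rule implemented in
make_grouped_join_rel above (this is not PostgreSQL code: plain unsigned
masks stand in for Relids bitmapsets, and the relids for A, B and C are
hypothetical).  A candidate level equal to the recorded one keeps building
grouped paths, a strict subset replaces the recorded level, and anything
else is skipped:

#include <stdio.h>
#include <stdbool.h>

static bool
is_subset(unsigned sub, unsigned super)
{
    return (sub & super) == sub;
}

int
main(void)
{
    unsigned    A = 1, B = 2, C = 4;
    unsigned    recorded = A | B | C;   /* first seen as ((A JOIN B) JOIN C) */
    unsigned    candidate = B | C;      /* later seen directly as (B JOIN C) */

    if (candidate == recorded)
        printf("same level: keep building grouped paths\n");
    else if (is_subset(candidate, recorded))
    {
        recorded = candidate;           /* adopt the smaller, lower level */
        printf("subset: update designated level to (B C)\n");
    }
    else
        printf("different level: skip grouped paths here\n");
    return 0;
}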
index 3e3fec8925291f795075cfc3cb23981f48b15392..b8d1c7e88a3ef0eff73a39e84cfda1d89df91dd8 100644
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -31,6 +32,7 @@
 #include "optimizer/restrictinfo.h"
 #include "parser/analyze.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/typcache.h"
@@ -81,6 +83,12 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static bool is_partial_agg_memory_risky(PlannerInfo *root);
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
+static EquivalenceClass *get_eclass_for_sortgroupclause(PlannerInfo *root,
+                                                                                                               SortGroupClause *sgc,
+                                                                                                               Expr *expr);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
                                                                           Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +636,368 @@ remove_useless_groupby_columns(PlannerInfo *root)
        }
 }
 
+/*
+ * setup_eager_aggregation
+ *       Check if eager aggregation is applicable, and if so collect suitable
+ *       aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+       /*
+        * Don't apply eager aggregation if it is disabled by the user.
+        */
+       if (!enable_eager_aggregate)
+               return;
+
+       /*
+        * Don't apply eager aggregation if there are no available GROUP BY
+        * clauses.
+        */
+       if (!root->processed_groupClause)
+               return;
+
+       /*
+        * For now we don't try to support grouping sets.
+        */
+       if (root->parse->groupingSets)
+               return;
+
+       /*
+        * For now we don't try to support DISTINCT or ORDER BY aggregates.
+        */
+       if (root->numOrderedAggs > 0)
+               return;
+
+       /*
+        * If there are any aggregates that do not support partial mode, or any
+        * partial aggregates that are non-serializable, do not apply eager
+        * aggregation.
+        */
+       if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+               return;
+
+       /*
+        * We don't try to apply eager aggregation if there are set-returning
+        * functions in targetlist.
+        * functions in the targetlist.
+       if (root->parse->hasTargetSRFs)
+               return;
+
+       /*
+        * Eager aggregation only makes sense if there are multiple base rels in
+        * the query.
+        */
+       if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+               return;
+
+       /*
+        * Don't apply eager aggregation if any aggregate poses a risk of
+        * excessive memory usage during partial aggregation.
+        */
+       if (is_partial_agg_memory_risky(root))
+               return;
+
+       /*
+        * Collect aggregate expressions and plain Vars that appear in the
+        * targetlist and havingQual.
+        */
+       create_agg_clause_infos(root);
+
+       /*
+        * If there are no suitable aggregate expressions, we cannot apply eager
+        * aggregation.
+        */
+       if (root->agg_clause_list == NIL)
+               return;
+
+       /*
+        * Collect grouping expressions that appear in grouping clauses.
+        */
+       create_grouping_expr_infos(root);
+}
+
+/*
+ * is_partial_agg_memory_risky
+ *       Check if any aggregate poses a risk of excessive memory usage during
+ *       partial aggregation.
+ *
+ * We check if any aggregate has a negative aggtransspace value, which
+ * indicates that its transition state data can grow unboundedly in size.
+ * Applying eager aggregation in such cases risks high memory usage since
+ * partial aggregation results might be stored in join hash tables or
+ * materialized nodes.
+ */
+static bool
+is_partial_agg_memory_risky(PlannerInfo *root)
+{
+       ListCell   *lc;
+
+       foreach(lc, root->aggtransinfos)
+       {
+               AggTransInfo *transinfo = lfirst_node(AggTransInfo, lc);
+
+               if (transinfo->aggtransspace < 0)
+                       return true;
+       }
+
+       return false;
+}
+
+/*
+ * create_agg_clause_infos
+ *       Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *       create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+       List       *tlist_exprs;
+       List       *agg_clause_list = NIL;
+       List       *tlist_vars = NIL;
+       Relids          aggregate_relids = NULL;
+       bool            eager_agg_applicable = true;
+       ListCell   *lc;
+
+       Assert(root->agg_clause_list == NIL);
+       Assert(root->tlist_vars == NIL);
+
+       tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+                                                                 PVC_INCLUDE_AGGREGATES |
+                                                                 PVC_RECURSE_WINDOWFUNCS |
+                                                                 PVC_RECURSE_PLACEHOLDERS);
+
+       /*
+        * Aggregates within the HAVING clause need to be processed in the same
+        * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+        * but not WindowFuncs.
+        */
+       if (root->parse->havingQual != NULL)
+       {
+               List       *having_exprs;
+
+               having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+                                                                          PVC_INCLUDE_AGGREGATES |
+                                                                          PVC_RECURSE_PLACEHOLDERS);
+               if (having_exprs != NIL)
+               {
+                       tlist_exprs = list_concat(tlist_exprs, having_exprs);
+                       list_free(having_exprs);
+               }
+       }
+
+       foreach(lc, tlist_exprs)
+       {
+               Expr       *expr = (Expr *) lfirst(lc);
+               Aggref     *aggref;
+               Relids          agg_eval_at;
+               AggClauseInfo *ac_info;
+
+               /* For now we don't try to support GROUPING() expressions */
+               if (IsA(expr, GroupingFunc))
+               {
+                       eager_agg_applicable = false;
+                       break;
+               }
+
+               /* Collect plain Vars for future reference */
+               if (IsA(expr, Var))
+               {
+                       tlist_vars = list_append_unique(tlist_vars, expr);
+                       continue;
+               }
+
+               aggref = castNode(Aggref, expr);
+
+               Assert(aggref->aggorder == NIL);
+               Assert(aggref->aggdistinct == NIL);
+
+               /*
+                * If there are any securityQuals, do not try to apply eager
+                * aggregation if any non-leakproof aggregate functions are present.
+                * This is overly strict, but for now...
+                * This is overly strict, but it will do for now.
+               if (root->qual_security_level > 0 &&
+                       !get_func_leakproof(aggref->aggfnoid))
+               {
+                       eager_agg_applicable = false;
+                       break;
+               }
+
+               agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+               /*
+                * If all base relations in the query are referenced by aggregate
+                * functions, then eager aggregation is not applicable.
+                */
+               aggregate_relids = bms_add_members(aggregate_relids, agg_eval_at);
+               if (bms_is_subset(root->all_baserels, aggregate_relids))
+               {
+                       eager_agg_applicable = false;
+                       break;
+               }
+
+               /* OK, create the AggClauseInfo node */
+               ac_info = makeNode(AggClauseInfo);
+               ac_info->aggref = aggref;
+               ac_info->agg_eval_at = agg_eval_at;
+
+               /* ... and add it to the list */
+               agg_clause_list = list_append_unique(agg_clause_list, ac_info);
+       }
+
+       list_free(tlist_exprs);
+
+       if (eager_agg_applicable)
+       {
+               root->agg_clause_list = agg_clause_list;
+               root->tlist_vars = tlist_vars;
+       }
+       else
+       {
+               list_free_deep(agg_clause_list);
+               list_free(tlist_vars);
+       }
+}
+
+/*
+ * create_grouping_expr_infos
+ *       Create a GroupingExprInfo for each expression usable as grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+       List       *exprs = NIL;
+       List       *sortgrouprefs = NIL;
+       List       *ecs = NIL;
+       ListCell   *lc,
+                          *lc1,
+                          *lc2,
+                          *lc3;
+
+       Assert(root->group_expr_list == NIL);
+
+       foreach(lc, root->processed_groupClause)
+       {
+               SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+               TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+               TypeCacheEntry *tce;
+               Oid                     equalimageproc;
+
+               Assert(tle->ressortgroupref > 0);
+
+               /*
+                * For now we only support plain Vars as grouping expressions.
+                */
+               if (!IsA(tle->expr, Var))
+                       return;
+
+               /*
+                * Eager aggregation is only possible if equality implies image
+                * equality for each grouping key.  Otherwise, placing keys with
+                * different byte images into the same group may result in the loss of
+                * information that could be necessary to evaluate upper qual clauses.
+                *
+                * For instance, the NUMERIC data type is not supported, as values
+                * that are considered equal by the equality operator (e.g., 0 and
+                * 0.0) can have different scales.
+                */
+               tce = lookup_type_cache(exprType((Node *) tle->expr),
+                                                               TYPECACHE_BTREE_OPFAMILY);
+               if (!OidIsValid(tce->btree_opf) ||
+                       !OidIsValid(tce->btree_opintype))
+                       return;
+
+               equalimageproc = get_opfamily_proc(tce->btree_opf,
+                                                                                  tce->btree_opintype,
+                                                                                  tce->btree_opintype,
+                                                                                  BTEQUALIMAGE_PROC);
+               if (!OidIsValid(equalimageproc) ||
+                       !DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+                                                                                          tce->typcollation,
+                                                                                          ObjectIdGetDatum(tce->btree_opintype))))
+                       return;
+
+               exprs = lappend(exprs, tle->expr);
+               sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+               ecs = lappend(ecs, get_eclass_for_sortgroupclause(root, sgc, tle->expr));
+       }
+
+       /*
+        * Construct a GroupingExprInfo for each expression.
+        */
+       forthree(lc1, exprs, lc2, sortgrouprefs, lc3, ecs)
+       {
+               Expr       *expr = (Expr *) lfirst(lc1);
+               int                     sortgroupref = lfirst_int(lc2);
+               EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc3);
+               GroupingExprInfo *ge_info;
+
+               ge_info = makeNode(GroupingExprInfo);
+               ge_info->expr = (Expr *) copyObject(expr);
+               ge_info->sortgroupref = sortgroupref;
+               ge_info->ec = ec;
+
+               root->group_expr_list = lappend(root->group_expr_list, ge_info);
+       }
+}
+
+/*
+ * get_eclass_for_sortgroupclause
+ *       Given a group clause and an expression, find an existing equivalence
+ *       class that the expression is a member of; return NULL if none.
+ */
+static EquivalenceClass *
+get_eclass_for_sortgroupclause(PlannerInfo *root, SortGroupClause *sgc,
+                                                          Expr *expr)
+{
+       Oid                     opfamily,
+                               opcintype,
+                               collation;
+       CompareType cmptype;
+       Oid                     equality_op;
+       List       *opfamilies;
+
+       /* Punt if the group clause is not sortable */
+       if (!OidIsValid(sgc->sortop))
+               return NULL;
+
+       /* Find the operator in pg_amop --- failure shouldn't happen */
+       if (!get_ordering_op_properties(sgc->sortop,
+                                                                       &opfamily, &opcintype, &cmptype))
+               elog(ERROR, "operator %u is not a valid ordering operator",
+                        sgc->sortop);
+
+       /* Because SortGroupClause doesn't carry collation, consult the expr */
+       collation = exprCollation((Node *) expr);
+
+       /*
+        * EquivalenceClasses need to contain opfamily lists based on the family
+        * membership of mergejoinable equality operators, which could belong to
+        * more than one opfamily.  So we have to look up the opfamily's equality
+        * operator and get its membership.
+        */
+       equality_op = get_opfamily_member_for_cmptype(opfamily,
+                                                                                                 opcintype,
+                                                                                                 opcintype,
+                                                                                                 COMPARE_EQ);
+       if (!OidIsValid(equality_op))   /* shouldn't happen */
+               elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
+                        COMPARE_EQ, opcintype, opcintype, opfamily);
+       opfamilies = get_mergejoin_opfamilies(equality_op);
+       if (!opfamilies)                        /* certainly should find some */
+               elog(ERROR, "could not find opfamilies for equality operator %u",
+                        equality_op);
+
+       /* Now find a matching EquivalenceClass */
+       return get_eclass_for_sort_expr(root, expr, opfamilies, opcintype,
+                                                                       collation, sgc->tleSortGroupRef,
+                                                                       NULL, false);
+}
+
 /*****************************************************************************
  *
  *       LATERAL REFERENCES
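The BTEQUALIMAGE_PROC test in create_grouping_expr_infos above can be
illustrated with a minimal standalone sketch (not PostgreSQL code).  The
"decimal" type below is a hypothetical stand-in in the spirit of NUMERIC:
two values can compare equal while their stored images differ, which is
exactly the situation that makes a datatype unsafe as a pushed-down
grouping key.

#include <stdio.h>
#include <string.h>
#include <stdbool.h>

typedef struct
{
    long long   digits;         /* value is digits * 10^(-scale) */
    int         scale;
} decimal;

static bool
value_equal(decimal a, decimal b)
{
    /* rescale both operands to the larger scale, then compare digits */
    while (a.scale < b.scale)
    {
        a.digits *= 10;
        a.scale++;
    }
    while (b.scale < a.scale)
    {
        b.digits *= 10;
        b.scale++;
    }
    return a.digits == b.digits;
}

int
main(void)
{
    decimal     zero = {0, 0};      /* "0"   */
    decimal     zero_0 = {0, 1};    /* "0.0" */

    /* equal as values, but their stored forms differ (the scale field) */
    printf("value equal: %d\n", value_equal(zero, zero_0));
    printf("image equal: %d\n",
           memcmp(&zero.scale, &zero_0.scale, sizeof(int)) == 0);
    return 0;
}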
index 5467e094ca7e0ac553fbc7b1c3e42a22dee195f4..eefc486a5666cf682a1844fa7e9b128487599da3 100644
@@ -76,6 +76,9 @@ query_planner(PlannerInfo *root,
        root->placeholder_list = NIL;
        root->placeholder_array = NULL;
        root->placeholder_array_size = 0;
+       root->agg_clause_list = NIL;
+       root->group_expr_list = NIL;
+       root->tlist_vars = NIL;
        root->fkey_list = NIL;
        root->initial_rels = NIL;
 
@@ -265,6 +268,12 @@ query_planner(PlannerInfo *root,
         */
        extract_restriction_or_clauses(root);
 
+       /*
+        * Check if eager aggregation is applicable, and if so, set up
+        * root->agg_clause_list and root->group_expr_list.
+        */
+       setup_eager_aggregation(root);
+
        /*
         * Now expand appendrels by adding "otherrels" for their children.  We
         * delay this to the end so that we have as much information as possible
index 0c9397a36c31bda96212299df847eff74c52635e..6f06174cdab063cecd366c37968817077ac8af42 100644
@@ -232,7 +232,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
                                                                          RelOptInfo *partially_grouped_rel,
                                                                          const AggClauseCosts *agg_costs,
                                                                          grouping_sets_data *gd,
-                                                                         double dNumGroups,
                                                                          GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
                                                                                                 RelOptInfo *grouped_rel,
@@ -4014,9 +4013,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
                                                           GroupPathExtraData *extra,
                                                           RelOptInfo **partially_grouped_rel_p)
 {
-       Path       *cheapest_path = input_rel->cheapest_total_path;
        RelOptInfo *partially_grouped_rel = NULL;
-       double          dNumGroups;
        PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
        /*
@@ -4098,23 +4095,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
        /* Gather any partially grouped partial paths. */
        if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-       {
                gather_grouping_paths(root, partially_grouped_rel);
-               set_cheapest(partially_grouped_rel);
-       }
 
-       /*
-        * Estimate number of groups.
-        */
-       dNumGroups = get_number_of_groups(root,
-                                                                         cheapest_path->rows,
-                                                                         gd,
-                                                                         extra->targetList);
+       /* Now choose the best path(s) for partially_grouped_rel. */
+       if (partially_grouped_rel && partially_grouped_rel->pathlist)
+               set_cheapest(partially_grouped_rel);
 
        /* Build final grouping paths */
        add_paths_to_grouping_rel(root, input_rel, grouped_rel,
                                                          partially_grouped_rel, agg_costs, gd,
-                                                         dNumGroups, extra);
+                                                         extra);
 
        /* Give a helpful error if we failed to find any implementation */
        if (grouped_rel->pathlist == NIL)
@@ -7059,16 +7049,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
                                                  RelOptInfo *grouped_rel,
                                                  RelOptInfo *partially_grouped_rel,
                                                  const AggClauseCosts *agg_costs,
-                                                 grouping_sets_data *gd, double dNumGroups,
+                                                 grouping_sets_data *gd,
                                                  GroupPathExtraData *extra)
 {
        Query      *parse = root->parse;
        Path       *cheapest_path = input_rel->cheapest_total_path;
+       Path       *cheapest_partially_grouped_path = NULL;
        ListCell   *lc;
        bool            can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
        bool            can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
        List       *havingQual = (List *) extra->havingQual;
        AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+       double          dNumGroups = 0;
+       double          dNumFinalGroups = 0;
+
+       /*
+        * Estimate number of groups for non-split aggregation.
+        */
+       dNumGroups = get_number_of_groups(root,
+                                                                         cheapest_path->rows,
+                                                                         gd,
+                                                                         extra->targetList);
+
+       if (partially_grouped_rel && partially_grouped_rel->pathlist)
+       {
+               cheapest_partially_grouped_path =
+                       partially_grouped_rel->cheapest_total_path;
+
+               /*
+                * Estimate number of groups for final phase of partial aggregation.
+                */
+               dNumFinalGroups =
+                       get_number_of_groups(root,
+                                                                cheapest_partially_grouped_path->rows,
+                                                                gd,
+                                                                extra->targetList);
+       }
 
        if (can_sort)
        {
@@ -7181,7 +7197,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
                                        path = make_ordered_path(root,
                                                                                         grouped_rel,
                                                                                         path,
-                                                                                        partially_grouped_rel->cheapest_total_path,
+                                                                                        cheapest_partially_grouped_path,
                                                                                         info->pathkeys,
                                                                                         -1.0);
 
@@ -7199,7 +7215,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
                                                                                                 info->clauses,
                                                                                                 havingQual,
                                                                                                 agg_final_costs,
-                                                                                                dNumGroups));
+                                                                                                dNumFinalGroups));
                                        else
                                                add_path(grouped_rel, (Path *)
                                                                 create_group_path(root,
@@ -7207,7 +7223,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
                                                                                                   path,
                                                                                                   info->clauses,
                                                                                                   havingQual,
-                                                                                                  dNumGroups));
+                                                                                                  dNumFinalGroups));
 
                                }
                        }
@@ -7249,19 +7265,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
                 */
                if (partially_grouped_rel && partially_grouped_rel->pathlist)
                {
-                       Path       *path = partially_grouped_rel->cheapest_total_path;
-
                        add_path(grouped_rel, (Path *)
                                         create_agg_path(root,
                                                                         grouped_rel,
-                                                                        path,
+                                                                        cheapest_partially_grouped_path,
                                                                         grouped_rel->reltarget,
                                                                         AGG_HASHED,
                                                                         AGGSPLIT_FINAL_DESERIAL,
                                                                         root->processed_groupClause,
                                                                         havingQual,
                                                                         agg_final_costs,
-                                                                        dNumGroups));
+                                                                        dNumFinalGroups));
                }
        }
 
@@ -7301,6 +7315,7 @@ create_partial_grouping_paths(PlannerInfo *root,
 {
        Query      *parse = root->parse;
        RelOptInfo *partially_grouped_rel;
+       RelOptInfo *eager_agg_rel = NULL;
        AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
        AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
        Path       *cheapest_partial_path = NULL;
@@ -7311,6 +7326,15 @@ create_partial_grouping_paths(PlannerInfo *root,
        bool            can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
        bool            can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+       /*
+        * Check whether any partially aggregated paths have been generated
+        * through eager aggregation.
+        */
+       if (input_rel->grouped_rel &&
+               !IS_DUMMY_REL(input_rel->grouped_rel) &&
+               input_rel->grouped_rel->pathlist != NIL)
+               eager_agg_rel = input_rel->grouped_rel;
+
        /*
         * Consider whether we should generate partially aggregated non-partial
         * paths.  We can only do this if we have a non-partial path, and only if
@@ -7332,11 +7356,13 @@ create_partial_grouping_paths(PlannerInfo *root,
 
        /*
         * If we can't partially aggregate partial paths, and we can't partially
-        * aggregate non-partial paths, then don't bother creating the new
+        * aggregate non-partial paths, and no partially aggregated paths were
+        * generated by eager aggregation, then don't bother creating the new
         * RelOptInfo at all, unless the caller specified force_rel_creation.
         */
        if (cheapest_total_path == NULL &&
                cheapest_partial_path == NULL &&
+               eager_agg_rel == NULL &&
                !force_rel_creation)
                return NULL;
 
@@ -7561,6 +7587,51 @@ create_partial_grouping_paths(PlannerInfo *root,
                                                                                 dNumPartialPartialGroups));
        }
 
+       /*
+        * Add any partially aggregated paths generated by eager aggregation to
+        * the new upper relation after applying projection steps as needed.
+        */
+       if (eager_agg_rel)
+       {
+               /* Add the paths */
+               foreach(lc, eager_agg_rel->pathlist)
+               {
+                       Path       *path = (Path *) lfirst(lc);
+
+                       /* Shouldn't have any parameterized paths anymore */
+                       Assert(path->param_info == NULL);
+
+                       path = (Path *) create_projection_path(root,
+                                                                                                  partially_grouped_rel,
+                                                                                                  path,
+                                                                                                  partially_grouped_rel->reltarget);
+
+                       add_path(partially_grouped_rel, path);
+               }
+
+               /*
+                * Likewise add the partial paths, but only if parallelism is possible
+                * for partially_grouped_rel.
+                */
+               if (partially_grouped_rel->consider_parallel)
+               {
+                       foreach(lc, eager_agg_rel->partial_pathlist)
+                       {
+                               Path       *path = (Path *) lfirst(lc);
+
+                               /* Shouldn't have any parameterized paths anymore */
+                               Assert(path->param_info == NULL);
+
+                               path = (Path *) create_projection_path(root,
+                                                                                                          partially_grouped_rel,
+                                                                                                          path,
+                                                                                                          partially_grouped_rel->reltarget);
+
+                               add_partial_path(partially_grouped_rel, path);
+                       }
+               }
+       }
+
        /*
         * If there is an FDW that's responsible for all baserels of the query,
         * let it consider adding partially grouped ForeignPaths.
@@ -8124,13 +8195,6 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
                add_paths_to_append_rel(root, partially_grouped_rel,
                                                                partially_grouped_live_children);
-
-               /*
-                * We need call set_cheapest, since the finalization step will use the
-                * cheapest path from the rel.
-                */
-               if (partially_grouped_rel->pathlist)
-                       set_cheapest(partially_grouped_rel);
        }
 
        /* If possible, create append paths for fully grouped children. */
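A minimal standalone sketch of the two group-count estimates now computed in
add_paths_to_grouping_rel above: dNumGroups is estimated from the rows of the
cheapest non-grouped input path, while dNumFinalGroups is estimated from the
rows of the cheapest partially grouped path.  estimate_groups() here is a
hypothetical stand-in for get_number_of_groups(), and all row counts are made
up for illustration.

#include <stdio.h>

static double
estimate_groups(double input_rows, double ndistinct)
{
    /* crude stand-in: groups cannot exceed input rows or distinct values */
    return input_rows < ndistinct ? input_rows : ndistinct;
}

int
main(void)
{
    double      ndistinct = 10000.0;        /* distinct GROUP BY values */
    double      join_output_rows = 1000000.0;   /* rows fed to one-shot aggregation */
    double      partially_grouped_rows = 50000.0;   /* rows left after partial agg */

    /* estimate for non-split aggregation over the full join output */
    double      dNumGroups = estimate_groups(join_output_rows, ndistinct);

    /* estimate for the finalize step over partially aggregated rows */
    double      dNumFinalGroups = estimate_groups(partially_grouped_rows, ndistinct);

    printf("dNumGroups = %.0f, dNumFinalGroups = %.0f\n",
           dNumGroups, dNumFinalGroups);
    return 0;
}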
index 5b3dc0d865399b469819619d5bb7c18384ac7dd1..69b8b0c2ae05685d1a45dc694ee7735e45b9ee3f 100644
@@ -516,6 +516,57 @@ adjust_appendrel_attrs_mutator(Node *node,
                return (Node *) newinfo;
        }
 
+       /*
+        * We have to process RelAggInfo nodes specially.
+        */
+       if (IsA(node, RelAggInfo))
+       {
+               RelAggInfo *oldinfo = (RelAggInfo *) node;
+               RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+               newinfo->target = (PathTarget *)
+                       adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+                                                                                  context);
+
+               newinfo->agg_input = (PathTarget *)
+                       adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+                                                                                  context);
+
+               newinfo->group_clauses = oldinfo->group_clauses;
+
+               newinfo->group_exprs = (List *)
+                       adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+                                                                                  context);
+
+               return (Node *) newinfo;
+       }
+
+       /*
+        * We have to process PathTarget nodes specially.
+        */
+       if (IsA(node, PathTarget))
+       {
+               PathTarget *oldtarget = (PathTarget *) node;
+               PathTarget *newtarget = makeNode(PathTarget);
+
+               /* Copy all flat-copiable fields */
+               memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+               newtarget->exprs = (List *)
+                       adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+                                                                                  context);
+
+               if (oldtarget->sortgrouprefs)
+               {
+                       Size            nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+                       newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+                       memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+               }
+
+               return (Node *) newtarget;
+       }
+
        /*
         * NOTE: we do not need to recurse into sublinks, because they should
         * already have been converted to subplans before we see them.
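The PathTarget case added to adjust_appendrel_attrs_mutator above flat-copies
the node and then gives the copy its own sortgrouprefs array, so the child's
copy is independent of the parent's.  A minimal standalone sketch of that
pattern (not PostgreSQL code; the "target" struct is a hypothetical stand-in
for PathTarget):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    int         nexprs;
    unsigned   *sortgrouprefs;
} target;

static target *
copy_target(const target *src)
{
    target     *dst = malloc(sizeof(target));

    memcpy(dst, src, sizeof(target));       /* flat copy of all fields */
    if (src->sortgrouprefs)
    {
        size_t      nbytes = src->nexprs * sizeof(unsigned);

        dst->sortgrouprefs = malloc(nbytes);    /* deep copy of the array */
        memcpy(dst->sortgrouprefs, src->sortgrouprefs, nbytes);
    }
    return dst;
}

int
main(void)
{
    unsigned    refs[] = {1, 2, 0};
    target      parent = {3, refs};
    target     *child = copy_target(&parent);

    child->sortgrouprefs[0] = 7;            /* does not affect the parent's array */
    printf("parent %u, child %u\n",
           parent.sortgrouprefs[0], child->sortgrouprefs[0]);
    free(child->sortgrouprefs);
    free(child);
    return 0;
}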
index 0e523d2eb5b4486fb2b47ea8463199a3fdb21829..cf1bc6721370097ec4b205ef943606d551c60342 100644
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
 typedef struct JoinHashEntry
@@ -83,6 +89,14 @@ static void build_child_join_reltarget(PlannerInfo *root,
                                                                           RelOptInfo *childrel,
                                                                           int nappinfos,
                                                                           AppendRelInfo **appinfos);
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+                                                                                                       RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+                                                                 PathTarget *target, PathTarget *agg_input,
+                                                                 List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
 
 
 /*
@@ -278,6 +292,8 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
        rel->joininfo = NIL;
        rel->has_eclass_joins = false;
        rel->consider_partitionwise_join = false;       /* might get changed later */
+       rel->agg_info = NULL;
+       rel->grouped_rel = NULL;
        rel->part_scheme = NULL;
        rel->nparts = -1;
        rel->boundinfo = NULL;
@@ -408,6 +424,103 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
        return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *       Construct a new RelOptInfo representing a grouped version of the input
+ *       simple relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+       RelOptInfo *grouped_rel;
+       RelAggInfo *agg_info;
+
+       /*
+        * We should have available aggregate expressions and grouping
+        * expressions; otherwise we should not have gotten here.
+        */
+       Assert(root->agg_clause_list != NIL);
+       Assert(root->group_expr_list != NIL);
+
+       /* nothing to do for dummy rel */
+       if (IS_DUMMY_REL(rel))
+               return NULL;
+
+       /*
+        * Prepare the information needed to create grouped paths for this simple
+        * relation.
+        */
+       agg_info = create_rel_agg_info(root, rel, true);
+       if (agg_info == NULL)
+               return NULL;
+
+       /*
+        * If grouped paths for the given simple relation are not considered
+        * useful, skip building the grouped relation.
+        */
+       if (!agg_info->agg_useful)
+               return NULL;
+
+       /* Track the set of relids at which partial aggregation is applied */
+       agg_info->apply_at = bms_copy(rel->relids);
+
+       /* build the grouped relation */
+       grouped_rel = build_grouped_rel(root, rel);
+       grouped_rel->reltarget = agg_info->target;
+       grouped_rel->rows = agg_info->grouped_rows;
+       grouped_rel->agg_info = agg_info;
+
+       rel->grouped_rel = grouped_rel;
+
+       return grouped_rel;
+}
+
+/*
+ * build_grouped_rel
+ *       Build a grouped relation by flat-copying the input relation and resetting
+ *       the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+       RelOptInfo *grouped_rel;
+
+       grouped_rel = makeNode(RelOptInfo);
+       memcpy(grouped_rel, rel, sizeof(RelOptInfo));
+
+       /*
+        * clear path info
+        */
+       grouped_rel->pathlist = NIL;
+       grouped_rel->ppilist = NIL;
+       grouped_rel->partial_pathlist = NIL;
+       grouped_rel->cheapest_startup_path = NULL;
+       grouped_rel->cheapest_total_path = NULL;
+       grouped_rel->cheapest_parameterized_paths = NIL;
+
+       /*
+        * clear partition info
+        */
+       grouped_rel->part_scheme = NULL;
+       grouped_rel->nparts = -1;
+       grouped_rel->boundinfo = NULL;
+       grouped_rel->partbounds_merged = false;
+       grouped_rel->partition_qual = NIL;
+       grouped_rel->part_rels = NULL;
+       grouped_rel->live_parts = NULL;
+       grouped_rel->all_partrels = NULL;
+       grouped_rel->partexprs = NULL;
+       grouped_rel->nullable_partexprs = NULL;
+       grouped_rel->consider_partitionwise_join = false;
+
+       /*
+        * clear size estimates
+        */
+       grouped_rel->rows = 0;
+
+       return grouped_rel;
+}
+
 /*
  * find_base_rel
  *       Find a base or otherrel relation entry, which must already exist.
@@ -759,6 +872,8 @@ build_join_rel(PlannerInfo *root,
        joinrel->joininfo = NIL;
        joinrel->has_eclass_joins = false;
        joinrel->consider_partitionwise_join = false;   /* might get changed later */
+       joinrel->agg_info = NULL;
+       joinrel->grouped_rel = NULL;
        joinrel->parent = NULL;
        joinrel->top_parent = NULL;
        joinrel->top_parent_relids = NULL;
@@ -945,6 +1060,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
        joinrel->joininfo = NIL;
        joinrel->has_eclass_joins = false;
        joinrel->consider_partitionwise_join = false;   /* might get changed later */
+       joinrel->agg_info = NULL;
+       joinrel->grouped_rel = NULL;
        joinrel->parent = parent_joinrel;
        joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
        joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2523,3 +2640,536 @@ build_child_join_reltarget(PlannerInfo *root,
        childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
        childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *       Create the RelAggInfo structure for the given relation if it can produce
+ *       grouped paths.  The given relation is the non-grouped one, whose
+ *       reltarget has already been constructed.
+ *
+ * calculate_grouped_rows: if true, calculate the estimated number of grouped
+ * rows for the relation.  If false, skip the estimation to avoid unnecessary
+ * planning overhead.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel,
+                                       bool calculate_grouped_rows)
+{
+       ListCell   *lc;
+       RelAggInfo *result;
+       PathTarget *agg_input;
+       PathTarget *target;
+       List       *group_clauses = NIL;
+       List       *group_exprs = NIL;
+
+       /*
+        * The lists of aggregate expressions and grouping expressions should have
+        * been constructed.
+        */
+       Assert(root->agg_clause_list != NIL);
+       Assert(root->group_expr_list != NIL);
+
+       /*
+        * If this is a child rel, the grouped rel for its parent rel, if one is
+        * possible at all, must already have been created.  So we can just use
+        * the parent's RelAggInfo if there is one, with appropriate variable
+        * substitutions.
+        */
+       if (IS_OTHER_REL(rel))
+       {
+               RelOptInfo *grouped_rel;
+               RelAggInfo *agg_info;
+
+               grouped_rel = rel->top_parent->grouped_rel;
+               if (grouped_rel == NULL)
+                       return NULL;
+
+               Assert(IS_GROUPED_REL(grouped_rel));
+
+               /* Must do multi-level transformation */
+               agg_info = (RelAggInfo *)
+                       adjust_appendrel_attrs_multilevel(root,
+                                                                                         (Node *) grouped_rel->agg_info,
+                                                                                         rel,
+                                                                                         rel->top_parent);
+
+               agg_info->apply_at = NULL;      /* caller will change this later */
+
+               if (calculate_grouped_rows)
+               {
+                       agg_info->grouped_rows =
+                               estimate_num_groups(root, agg_info->group_exprs,
+                                                                       rel->rows, NULL, NULL);
+
+                       /*
+                        * The grouped paths for the given relation are considered useful
+                        * iff the average group size is no less than
+                        * min_eager_agg_group_size.
+                        */
+                       agg_info->agg_useful =
+                               (rel->rows / agg_info->grouped_rows) >= min_eager_agg_group_size;
+               }
+
+               return agg_info;
+       }
+
+       /* Check if it's possible to produce grouped paths for this relation. */
+       if (!eager_aggregation_possible_for_relation(root, rel))
+               return NULL;
+
+       /*
+        * Create targets for the grouped paths and for the input paths of the
+        * grouped paths.
+        */
+       target = create_empty_pathtarget();
+       agg_input = create_empty_pathtarget();
+
+       /* ... and initialize these targets */
+       if (!init_grouping_targets(root, rel, target, agg_input,
+                                                          &group_clauses, &group_exprs))
+               return NULL;
+
+       /*
+        * Eager aggregation is not applicable if there are no available grouping
+        * expressions.
+        */
+       if (group_clauses == NIL)
+               return NULL;
+
+       /* Add aggregates to the grouping target */
+       foreach(lc, root->agg_clause_list)
+       {
+               AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+               Aggref     *aggref;
+
+               Assert(IsA(ac_info->aggref, Aggref));
+
+               aggref = (Aggref *) copyObject(ac_info->aggref);
+               mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+               add_column_to_pathtarget(target, (Expr *) aggref, 0);
+       }
+
+       /* Set the estimated eval cost and output width for both targets */
+       set_pathtarget_cost_width(root, target);
+       set_pathtarget_cost_width(root, agg_input);
+
+       /* build the RelAggInfo result */
+       result = makeNode(RelAggInfo);
+       result->target = target;
+       result->agg_input = agg_input;
+       result->group_clauses = group_clauses;
+       result->group_exprs = group_exprs;
+       result->apply_at = NULL;        /* caller will change this later */
+
+       if (calculate_grouped_rows)
+       {
+               result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+                                                                                                  rel->rows, NULL, NULL);
+
+               /*
+                * The grouped paths for the given relation are considered useful iff
+                * the average group size is no less than min_eager_agg_group_size.
+                */
+               result->agg_useful =
+                       (rel->rows / result->grouped_rows) >= min_eager_agg_group_size;
+       }
+
+       return result;
+}
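A minimal standalone sketch of the agg_useful test applied in
create_rel_agg_info above: grouped paths are kept only when the estimated
average partial-group size (input rows divided by grouped rows) reaches
min_eager_agg_group_size.  The GUC value and the row counts below are
hypothetical.

#include <stdio.h>
#include <stdbool.h>

int
main(void)
{
    double      min_eager_agg_group_size = 8.0; /* hypothetical GUC setting */
    double      rel_rows = 100000.0;            /* input rows of the relation */
    double      grouped_rows = 2500.0;          /* estimated partially grouped rows */

    bool        agg_useful = (rel_rows / grouped_rows) >= min_eager_agg_group_size;

    printf("average group size %.1f -> agg_useful = %s\n",
           rel_rows / grouped_rows, agg_useful ? "true" : "false");
    return 0;
}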
+
+/*
+ * eager_aggregation_possible_for_relation
+ *       Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+       ListCell   *lc;
+       int                     cur_relid;
+
+       /*
+        * Check to see if the given relation is in the nullable side of an outer
+        * join.  In this case, we cannot push a partial aggregation down to the
+        * relation, because the NULL-extended rows produced by the outer join
+        * would not be available when we perform the partial aggregation, while
+        * with a non-eager-aggregation plan these rows are available for the
+        * top-level aggregation.  Doing so may result in the rows being grouped
+        * differently than expected, or produce incorrect values from the
+        * aggregate functions.
+        */
+       cur_relid = -1;
+       while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+       {
+               RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+               if (baserel == NULL)
+                       continue;                       /* ignore outer joins in rel->relids */
+
+               if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+                       return false;
+       }
+
+       /*
+        * For now we don't try to support PlaceHolderVars.
+        */
+       foreach(lc, rel->reltarget->exprs)
+       {
+               Expr       *expr = lfirst(lc);
+
+               if (IsA(expr, PlaceHolderVar))
+                       return false;
+       }
+
+       /* Caller should only pass base relations or joins. */
+       Assert(rel->reloptkind == RELOPT_BASEREL ||
+                  rel->reloptkind == RELOPT_JOINREL);
+
+       /*
+        * Check if all aggregate expressions can be evaluated on this relation
+        * level.
+        */
+       foreach(lc, root->agg_clause_list)
+       {
+               AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+               Assert(IsA(ac_info->aggref, Aggref));
+
+               /*
+                * Give up if any aggregate requires relations other than the current
+                * one.  If the aggregate requires the current relation plus
+                * additional relations, grouping the current relation could make some
+                * input rows unavailable for the higher aggregate and may reduce the
+                * number of input rows it receives.  If the aggregate does not
+                * require the current relation at all, the current relation should
+                * not be grouped, as we do not support joining two grouped relations.
+                */
+               if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+                       return false;
+       }
+
+       return true;
+}
+
+/*
+ * init_grouping_targets
+ *       Initialize the target for grouped paths (target) as well as the target
+ *       for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+                                         PathTarget *target, PathTarget *agg_input,
+                                         List **group_clauses, List **group_exprs)
+{
+       ListCell   *lc;
+       List       *possibly_dependent = NIL;
+       Index           maxSortGroupRef;
+
+       /* Identify the max sortgroupref */
+       maxSortGroupRef = 0;
+       foreach(lc, root->processed_tlist)
+       {
+               Index           ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+               if (ref > maxSortGroupRef)
+                       maxSortGroupRef = ref;
+       }
+
+       /*
+        * At this point, all Vars from this relation that are needed by upper
+        * joins or are required in the final targetlist should already be present
+        * in its reltarget.  Therefore, we can safely iterate over this
+        * relation's reltarget->exprs to construct the PathTarget and grouping
+        * clauses for the grouped paths.
+        */
+       foreach(lc, rel->reltarget->exprs)
+       {
+               Expr       *expr = (Expr *) lfirst(lc);
+               Index           sortgroupref;
+
+               /*
+                * Given that PlaceHolderVar currently prevents us from doing eager
+                * aggregation, the source target cannot contain anything more complex
+                * than a Var.
+                */
+               Assert(IsA(expr, Var));
+
+               /*
+                * Get the sortgroupref of the expr if it is found among, or can be
+                * deduced from, the original grouping expressions.
+                */
+               sortgroupref = get_expression_sortgroupref(root, expr);
+               if (sortgroupref > 0)
+               {
+                       SortGroupClause *sgc;
+
+                       /* Find the matching SortGroupClause */
+                       sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+                       Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+                       /*
+                        * If the target expression is to be used as a grouping key, it
+                        * should be emitted by the grouped paths that have been pushed
+                        * down to this relation level.
+                        */
+                       add_column_to_pathtarget(target, expr, sortgroupref);
+
+                       /*
+                        * ... and it also should be emitted by the input paths.
+                        */
+                       add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+                       /*
+                        * Record this SortGroupClause and grouping expression.  Note that
+                        * this SortGroupClause might have already been recorded.
+                        */
+                       if (!list_member(*group_clauses, sgc))
+                       {
+                               *group_clauses = lappend(*group_clauses, sgc);
+                               *group_exprs = lappend(*group_exprs, expr);
+                       }
+               }
+               else if (is_var_needed_by_join(root, (Var *) expr, rel))
+               {
+                       /*
+                        * The expression is needed for an upper join but is neither in
+                        * the GROUP BY clause nor derivable from it using EC (otherwise,
+                        * it would have already been included in the targets above).  We
+                        * need to create a special SortGroupClause for this expression.
+                        *
+                        * It is important to include such expressions in the grouping
+                        * keys.  This is essential to ensure that an aggregated row from
+                        * the partial aggregation matches the other side of the join if
+                        * and only if each row in the partial group does.  This ensures
+                        * that all rows within the same partial group share the same
+                        * 'destiny', which is crucial for maintaining correctness.
+                        */
+                       SortGroupClause *sgc;
+                       TypeCacheEntry *tce;
+                       Oid                     equalimageproc;
+
+                       /*
+                        * But first, check if equality implies image equality for this
+                        * expression.  If not, we cannot use it as a grouping key.  See
+                        * comments in create_grouping_expr_infos().
+                        */
+                       tce = lookup_type_cache(exprType((Node *) expr),
+                                                                       TYPECACHE_BTREE_OPFAMILY);
+                       if (!OidIsValid(tce->btree_opf) ||
+                               !OidIsValid(tce->btree_opintype))
+                               return false;
+
+                       equalimageproc = get_opfamily_proc(tce->btree_opf,
+                                                                                          tce->btree_opintype,
+                                                                                          tce->btree_opintype,
+                                                                                          BTEQUALIMAGE_PROC);
+                       if (!OidIsValid(equalimageproc) ||
+                               !DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+                                                                                                  tce->typcollation,
+                                                                                                  ObjectIdGetDatum(tce->btree_opintype))))
+                               return false;
+
+                       /* Create the SortGroupClause. */
+                       sgc = makeNode(SortGroupClause);
+
+                       /* Initialize the SortGroupClause. */
+                       sgc->tleSortGroupRef = ++maxSortGroupRef;
+                       get_sort_group_operators(exprType((Node *) expr),
+                                                                        false, true, false,
+                                                                        &sgc->sortop, &sgc->eqop, NULL,
+                                                                        &sgc->hashable);
+
+                       /* This expression should be emitted by the grouped paths */
+                       add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+                       /* ... and it also should be emitted by the input paths. */
+                       add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+                       /* Record this SortGroupClause and grouping expression */
+                       *group_clauses = lappend(*group_clauses, sgc);
+                       *group_exprs = lappend(*group_exprs, expr);
+               }
+               else if (is_var_in_aggref_only(root, (Var *) expr))
+               {
+                       /*
+                        * The expression is referenced by an aggregate function pushed
+                        * down to this relation and does not appear elsewhere in the
+                        * targetlist or havingQual.  Add it to 'agg_input' but not to
+                        * 'target'.
+                        */
+                       add_new_column_to_pathtarget(agg_input, expr);
+               }
+               else
+               {
+                       /*
+                        * The expression may be functionally dependent on other
+                        * expressions in the target, but we cannot verify this until all
+                        * target expressions have been constructed.
+                        */
+                       possibly_dependent = lappend(possibly_dependent, expr);
+               }
+       }
+
+       /*
+        * Now we can verify whether an expression is functionally dependent on
+        * others.
+        */
+       foreach(lc, possibly_dependent)
+       {
+               Var                *tvar;
+               List       *deps = NIL;
+               RangeTblEntry *rte;
+
+               tvar = lfirst_node(Var, lc);
+               rte = root->simple_rte_array[tvar->varno];
+
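+               /*
+                * check_functional_grouping() reports whether this Var is functionally
+                * dependent on the expressions already in 'target', by virtue of a
+                * primary-key constraint on its relation.
+                */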
+               if (check_functional_grouping(rte->relid, tvar->varno,
+                                                                         tvar->varlevelsup,
+                                                                         target->exprs, &deps))
+               {
+                       /*
+                        * The expression is functionally dependent on other target
+                        * expressions, so it can be included in the targets.  Since it
+                        * will not be used as a grouping key, a sortgroupref is not
+                        * needed for it.
+                        */
+                       add_new_column_to_pathtarget(target, (Expr *) tvar);
+                       add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+               }
+               else
+               {
+                       /*
+                        * We may arrive here with a grouping expression that is proven
+                        * redundant by EquivalenceClass processing, such as 't1.a' in the
+                        * query below.
+                        *
+                        * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+                        * t1.b;
+                        *
+                        * For now we just give up in this case.
+                        */
+                       return false;
+               }
+       }
+
+       return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *       Check whether the given Var appears in aggregate expressions and not
+ *       elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+       ListCell   *lc;
+
+       /*
+        * Search the list of aggregate expressions for the Var.
+        */
+       foreach(lc, root->agg_clause_list)
+       {
+               AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+               List       *vars;
+
+               Assert(IsA(ac_info->aggref, Aggref));
+
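+               /* Skip aggregates whose evaluation does not involve this relation. */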
+               if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+                       continue;
+
+               vars = pull_var_clause((Node *) ac_info->aggref,
+                                                          PVC_RECURSE_AGGREGATES |
+                                                          PVC_RECURSE_WINDOWFUNCS |
+                                                          PVC_RECURSE_PLACEHOLDERS);
+
+               if (list_member(vars, var))
+               {
+                       list_free(vars);
+                       break;
+               }
+
+               list_free(vars);
+       }
+
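+       /*
+        * If we broke out of the loop above, 'lc' is left non-NULL, meaning the
+        * Var was found in some aggregate expression; otherwise the foreach macro
+        * has set it to NULL.  The Var then qualifies only if it is not also
+        * referenced as a plain Var in the targetlist or havingQual.
+        */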
+       return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *       Check if the given Var is needed by joins above the current rel.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+       Relids          relids;
+       int                     attno;
+       RelOptInfo *baserel;
+
+       /*
+        * When checking whether the Var is needed by joins above this rel, we do
+        * not want to count the case where the Var is needed only in the final
+        * output targetlist.  attr_needed represents the final targetlist as
+        * "relation 0", so add relid 0 to the set of relids we disregard.
+        */
+       relids = bms_copy(rel->relids);
+       relids = bms_add_member(relids, 0);
+
+       baserel = find_base_rel(root, var->varno);
+       attno = var->varattno - baserel->min_attr;
+
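+       /* True if some rel outside 'relids' still needs this attribute. */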
+       return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *       Return the sortgroupref of the given "expr" if it is found among the
+ *       original grouping expressions, or is known equal to any of the original
+ *       grouping expressions due to equivalence relationships.  Return 0 if no
+ *       match is found.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+       ListCell   *lc;
+
+       Assert(IsA(expr, Var));
+
+       foreach(lc, root->group_expr_list)
+       {
+               GroupingExprInfo *ge_info = lfirst_node(GroupingExprInfo, lc);
+               ListCell   *lc1;
+
+               Assert(IsA(ge_info->expr, Var));
+               Assert(ge_info->sortgroupref > 0);
+
+               if (equal(expr, ge_info->expr))
+                       return ge_info->sortgroupref;
+
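+               /*
+                * Otherwise the expression can only match by way of equivalences, so
+                * skip this entry if it has no EquivalenceClass or if the EC does not
+                * cover the expression's relation.
+                */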
+               if (ge_info->ec == NULL ||
+                       !bms_is_member(((Var *) expr)->varno, ge_info->ec->ec_relids))
+                       continue;
+
+               /*
+                * Scan the EquivalenceClass, looking for a match to the given
+                * expression.  We ignore child members here.
+                */
+               foreach(lc1, ge_info->ec->ec_members)
+               {
+                       EquivalenceMember *em = (EquivalenceMember *) lfirst(lc1);
+
+                       /* Child members should not exist in ec_members */
+                       Assert(!em->em_is_child);
+
+                       if (equal(expr, em->em_expr))
+                               return ge_info->sortgroupref;
+               }
+       }
+
+       /* no match found */
+       return 0;
+}
index 6bc6be13d2ad24fa84ab815fb634df7391717d52..b176d5130e4908020b08bf05b07cc53f44954e99 100644 (file)
   boot_val => 'false',
 },
 
+{ name => 'enable_eager_aggregate', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
+  short_desc => 'Enables eager aggregation.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'enable_eager_aggregate',
+  boot_val => 'true',
+},
+
 { name => 'enable_parallel_append', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
   short_desc => 'Enables the planner\'s use of parallel append plans.',
   flags => 'GUC_EXPLAIN',
   max => 'DBL_MAX',
 },
 
+{ name => 'min_eager_agg_group_size', type => 'real', context => 'PGC_USERSET', group => 'QUERY_TUNING_COST',
+  short_desc => 'Sets the minimum average group size required to consider applying eager aggregation.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'min_eager_agg_group_size',
+  boot_val => '8.0',
+  min => '0.0',
+  max => 'DBL_MAX',
+},
+
 { name => 'cursor_tuple_fraction', type => 'real', context => 'PGC_USERSET', group => 'QUERY_TUNING_OTHER',
   short_desc => 'Sets the planner\'s estimate of the fraction of a cursor\'s rows that will be retrieved.',
   flags => 'GUC_EXPLAIN',
index c36fcb9ab6105115c38c888f6811e03255208fa1..c5d612ab552635322b34a9d49a3f24d81c3fda6b 100644 (file)
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
 #enable_self_join_elimination = on
+#enable_eager_aggregate = on
 
 # - Planner Cost Constants -
 
 #min_parallel_table_scan_size = 8MB
 #min_parallel_index_scan_size = 512kB
 #effective_cache_size = 4GB
+#min_eager_agg_group_size = 8.0
 
 #jit_above_cost = 100000               # perform JIT compilation if available
                                        # and query more expensive than this;
index 4e3230ba23461bc011dfa62d30aa568331394f38..79408743166019b4ba8541d55033ac76b4bdb82f 100644 (file)
@@ -401,6 +401,15 @@ struct PlannerInfo
        /* list of PlaceHolderInfos */
        List       *placeholder_list;
 
+       /* list of AggClauseInfos */
+       List       *agg_clause_list;
+
+       /* list of GroupingExprInfos */
+       List       *group_expr_list;
+
+       /* list of plain Vars contained in targetlist and havingQual */
+       List       *tlist_vars;
+
        /* array of PlaceHolderInfos indexed by phid */
        struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
        /* allocated size of array */
@@ -1053,6 +1062,14 @@ typedef struct RelOptInfo
        /* consider partitionwise join paths? (if partitioned rel) */
        bool            consider_partitionwise_join;
 
+       /*
+        * used by eager aggregation:
+        */
+       /* information needed to create grouped paths */
+       struct RelAggInfo *agg_info;
+       /* the partially-aggregated version of the relation */
+       struct RelOptInfo *grouped_rel;
+
        /*
         * inheritance links, if this is an otherrel (otherwise NULL):
         */
@@ -1141,6 +1158,63 @@ typedef struct RelOptInfo
        ((nominal_jointype) == JOIN_INNER && (sjinfo)->jointype == JOIN_SEMI && \
         bms_equal((sjinfo)->syn_righthand, (rel)->relids))
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+       ((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *             Information needed to create paths for a grouped relation.
+ *
+ * "target" is the default result targetlist for Paths scanning this grouped
+ * relation; list of Vars/Exprs, cost, width.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "group_clauses" and "group_exprs" are lists of SortGroupClauses and the
+ * corresponding grouping expressions.
+ *
+ * "apply_at" tracks the set of relids at which partial aggregation is applied
+ * in the paths of this grouped relation.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.  It is set true if the average partial group size is no less than
+ * min_eager_agg_group_size, suggesting a significant row count reduction.
+ */
+typedef struct RelAggInfo
+{
+       pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+       NodeTag         type;
+
+       /* the output tlist for the grouped paths */
+       struct PathTarget *target;
+
+       /* the output tlist for the input paths */
+       struct PathTarget *agg_input;
+
+       /* a list of SortGroupClauses */
+       List       *group_clauses;
+       /* a list of grouping expressions */
+       List       *group_exprs;
+
+       /* the set of relids partial aggregation is applied at */
+       Relids          apply_at;
+
+       /* estimated number of result tuples */
+       Cardinality grouped_rows;
+
+       /* are the grouped paths considered useful? */
+       bool            agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *             Per-index information for planning/optimization
@@ -3285,6 +3359,49 @@ typedef struct MinMaxAggInfo
        Param      *param;
 } MinMaxAggInfo;
 
+/*
+ * For each distinct Aggref node that appears in the targetlist or HAVING
+ * clause, we store an AggClauseInfo node in the PlannerInfo node's
+ * agg_clause_list.  Each AggClauseInfo records the set of relations referenced
+ * by the aggregate expression.  This information is used to determine how far
+ * the aggregate can be safely pushed down in the join tree.
+ */
+typedef struct AggClauseInfo
+{
+       pg_node_attr(no_read, no_query_jumble)
+
+       NodeTag         type;
+
+       /* the Aggref expr */
+       Aggref     *aggref;
+
+       /* lowest level we can evaluate this aggregate at */
+       Relids          agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * For each grouping expression that appears in the GROUP BY clause, we store a
+ * GroupingExprInfo node in the PlannerInfo node's group_expr_list.  Each
+ * GroupingExprInfo records the expression being grouped on, its sortgroupref,
+ * and the EquivalenceClass it belongs to.  This information is necessary to
+ * reproduce correct grouping semantics at different levels of the join tree.
+ */
+typedef struct GroupingExprInfo
+{
+       pg_node_attr(no_read, no_query_jumble)
+
+       NodeTag         type;
+
+       /* the represented expression */
+       Expr       *expr;
+
+       /* the tleSortGroupRef of the corresponding SortGroupClause */
+       Index           sortgroupref;
+
+       /* the equivalence class the expression belongs to */
+       EquivalenceClass *ec pg_node_attr(copy_as_scalar, equal_as_scalar);
+} GroupingExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
index 763cd25bb3c9a0a4f0d243fd0b574da1693d5057..da60383c2aa449c7aa9c129243ab88ec82a0add2 100644 (file)
@@ -312,6 +312,8 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
                                                                        RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root, RelOptInfo *rel);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
@@ -351,4 +353,6 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
                                                                                SpecialJoinInfo *sjinfo,
                                                                                int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel,
+                                                                          bool calculate_grouped_rows);
 #endif                                                 /* PATHNODE_H */
index cbade77b717fb745a151b69ac1c2056050f53718..f6a62df0b43d65cea2270c052d756a0154fa6896 100644 (file)
@@ -21,7 +21,9 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
+extern PGDLLIMPORT double min_eager_agg_group_size;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
 extern PGDLLIMPORT bool enable_group_by_reordering;
@@ -57,6 +59,8 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
                                                                  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
                                                                                 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
+                                                                  RelOptInfo *rel);
 extern int     compute_parallel_worker(RelOptInfo *rel, double heap_pages,
                                                                        double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
index 9d3debcab2801a5a9e67dbd6e70e965c8a6f1083..09b48b26f8f4804cf183a20899d02f64a775b999 100644 (file)
@@ -76,6 +76,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
                                                                        Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
index 69805d4b9ec53203d5e63d96ed13cef03191869e..ef79d6f1ded964d3af5b11228cebeaa85352e1e3 100644 (file)
@@ -2437,11 +2437,11 @@ SELECT c collate "C", count(c) FROM pagg_tab3 GROUP BY c collate "C" ORDER BY 1;
 SET enable_partitionwise_join TO false;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2449,10 +2449,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
@@ -2464,11 +2466,11 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
 SET enable_partitionwise_join TO true;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2476,10 +2478,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644 (file)
index 0000000..fc0f8c1
--- /dev/null
@@ -0,0 +1,1714 @@
+--
+-- EAGER AGGREGATION
+-- Test that we can push aggregation down below a join
+--
+-- Make sure eager aggregation is enabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Scan one table, aggregate the result, join it to the other table, and
+-- finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Join two tables, aggregate the result, join it to the remaining table, and
+-- finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c)
+  FROM eager_agg_t1 t1
+  LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c)
+  FROM eager_agg_t1 t1
+  LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+--
+-- Test eager aggregation with GEQO
+--
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When the GROUP BY clause matches, full aggregation is performed for each
+-- partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.y ORDER BY t2.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- When the GROUP BY clause does not match, partial aggregation is performed
+-- for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '5'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.x, t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.x), PARTIAL count(*), PARTIAL avg(t1.x)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+ x |  sum  | count 
+---+-------+-------
+ 0 | 33835 |  6667
+ 1 | 39502 |  6667
+ 2 | 46169 |  6667
+ 3 | 52836 |  6667
+ 4 | 59503 |  6667
+ 5 | 33500 |  6667
+ 6 | 39837 |  6667
+ 7 | 46504 |  6667
+ 8 | 53171 |  6667
+ 9 | 59838 |  6667
+(10 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   
+----+---------
+  0 | 1437480
+  1 | 2082896
+  2 | 2684422
+  3 | 3285948
+  4 | 3887474
+  5 | 1526260
+  6 | 2127786
+  7 | 2729312
+  8 | 3330838
+  9 | 3932364
+ 10 | 1481370
+ 11 | 2012472
+ 12 | 2587464
+ 13 | 3162456
+ 14 | 3737448
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t2.x = t1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y))
+                           Group Key: t2.x, t3.y, t3.x
+                           ->  Incremental Sort
+                                 Output: t2.y, t2.x, t3.y, t3.x
+                                 Sort Key: t2.x, t3.y
+                                 Presorted Key: t2.x
+                                 ->  Merge Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Merge Cond: (t2.x = t3.x)
+                                       ->  Sort
+                                             Output: t2.y, t2.x
+                                             Sort Key: t2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                                   Output: t2.y, t2.x
+                                       ->  Sort
+                                             Output: t3.y, t3.x
+                                             Sort Key: t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+                     ->  Hash
+                           Output: t1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                 Output: t1.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                 Output: t1_2.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+ y |   sum   
+---+---------
+ 0 | 1111110
+ 1 | 2000132
+ 2 | 2889154
+ 3 | 3778176
+ 4 | 4667198
+ 5 | 3334000
+ 6 | 4223022
+ 7 | 5112044
+ 8 | 6001066
+ 9 | 6890088
+(10 rows)
+
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When the GROUP BY clause matches the partition key, full aggregation is
+-- performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When the GROUP BY clause does not match the partition key, partial
+-- aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1.y, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.y, t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.y, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x, t3.y, t3.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_tab_ml;
index 14a6d7513aa760a1ebfd46df5f435370d01f71b9..d10095de70fa1ac65387a5adde58f369fa0655b9 100644 (file)
@@ -2840,20 +2840,22 @@ select x.thousand, x.twothousand, count(*)
 from tenk1 x inner join tenk1 y on x.thousand = y.thousand
 group by x.thousand, x.twothousand
 order by x.thousand desc, x.twothousand;
-                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
- GroupAggregate
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Finalize GroupAggregate
    Group Key: x.thousand, x.twothousand
    ->  Incremental Sort
          Sort Key: x.thousand DESC, x.twothousand
          Presorted Key: x.thousand
          ->  Merge Join
                Merge Cond: (y.thousand = x.thousand)
-               ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
+               ->  Partial GroupAggregate
+                     Group Key: y.thousand
+                     ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
                ->  Sort
                      Sort Key: x.thousand DESC
                      ->  Seq Scan on tenk1 x
-(11 rows)
+(13 rows)
 
 reset enable_hashagg;
 reset enable_nestloop;
index cb12bf537197d0059f151db1de1729d7b084f438..fc84929a0024ff0a7f49fc9f8e3aadd7e63d33bc 100644 (file)
@@ -13,6 +13,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation plans.
+SET enable_eager_aggregate TO off;
 --
 -- Tests for list partitioned tables.
 --
index 83228cfca293fd8d30fb900d1a8d40bd354493f2..3b37fafa65b9568fcc028fb901dd79cece40ad2d 100644 (file)
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | on
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -172,7 +173,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(24 rows)
+(25 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
index fbffc67ae60131999206d223c414bdeefe0df077..f9450cdc4770348c77e1509c9b6e413a0278700b 100644 (file)
@@ -123,7 +123,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644 (file)
index 0000000..e328a83
--- /dev/null
@@ -0,0 +1,380 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c)
+  FROM eager_agg_t1 t1
+  LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t2.b ORDER BY t2.b;
+
+SELECT t2.b, avg(t2.c)
+  FROM eager_agg_t1 t1
+  LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+--
+-- Test eager aggregation with GEQO
+--
+
+SET geqo = on;
+SET geqo_threshold = 2;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When the GROUP BY clause matches the partition key, full aggregation is
+-- performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.y ORDER BY t2.y;
+
+SELECT t2.y, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.y ORDER BY t2.y;
+
+-- When the GROUP BY clause does not match the partition key, partial
+-- aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+
+SELECT t2.x, sum(t1.x), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+
+SELECT t3.y, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When the GROUP BY clause matches the partition key, full aggregation is
+-- performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+
+-- When the GROUP BY clause does not match the partition key, partial
+-- aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.y ORDER BY t1.y;
+
+SELECT t1.y, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+
+SELECT t3.y, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_tab_ml;
index ab070fee24496d2dbe07b98b68e08bcbb6a3d0fd..124cc26046163d32e1d2f92b8a37b04bd0032198 100644 (file)
@@ -14,6 +14,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation plans.
+SET enable_eager_aggregate TO off;
 
 --
 -- Tests for list partitioned tables.
index 37f26f6c6b75535ef264779edc639e1db86539ff..02b5b041c4581e4e1b17706a595d345e16831a9c 100644 (file)
@@ -42,6 +42,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1110,6 +1111,7 @@ GroupPathExtraData
 GroupResultPath
 GroupState
 GroupVarInfo
+GroupingExprInfo
 GroupingFunc
 GroupingSet
 GroupingSetData
@@ -2473,6 +2475,7 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber