From a273d393b7f9b66890d9bfa55982ac4c27d729c9 Mon Sep 17 00:00:00 2001
From: Bruce Momjian
Date: Tue, 4 Mar 2008 01:33:32 +0000
Subject: [PATCH] Add ideas for concurrent pg_dump and pg_restore:
< * pg_dump
> * pg_dump / pg_restore
> o Allow pg_dump to utilize multiple CPUs and I/O channels by dumping
> multiple objects simultaneously
>
> The difficulty with this is getting multiple dump processes to
> produce a single dump output file.
> http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php
>
> o Allow pg_restore to utilize multiple CPUs and I/O channels by
> restoring multiple objects simultaneously
>
> This might require a pg_restore flag to indicate how many
> simultaneous operations should be performed. Only pg_dump's
> -Fc format has the necessary dependency information.
>
> o To better utilize resources, restore data, primary keys, and
> indexes for a single table before restoring the next table
>
> Hopefully this will allow the CPU-I/O load to be more uniform
> for simultaneous restores. The idea is to start data restores
> for several objects, and once the first object is done, to move
> on to its primary keys and indexes. Over time, simultaneous
> data loads and index builds will be running.
>
> o To better utilize resources, allow pg_restore to check foreign
> keys simultaneously, where possible
> o Allow pg_restore to create all indexes of a table
> concurrently, via a single heap scan
>
> This requires a pg_dump -Fc file because that format contains
> the required dependency information.
> http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
>
> o Allow pg_restore to load different parts of the COPY data
> simultaneously
< single heap scan, and have a restore of a pg_dump somehow use it
> single heap scan, and have pg_restore use it
< http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
---
doc/TODO | 41 +++++++++++++++++++++++++++++++++++++----
doc/src/FAQ/TODO.html | 38 +++++++++++++++++++++++++++++++++-----
2 files changed, 70 insertions(+), 9 deletions(-)
diff --git a/doc/TODO b/doc/TODO
index 7d5d180ad91..2bdfa6ed31f 100644
--- a/doc/TODO
+++ b/doc/TODO
@@ -1,7 +1,7 @@
PostgreSQL TODO List
====================
Current maintainer: Bruce Momjian (bruce@momjian.us)
-Last updated: Mon Mar 3 16:26:04 EST 2008
+Last updated: Mon Mar 3 20:33:10 EST 2008
The most recent version of this document can be viewed at
http://www.postgresql.org/docs/faqs.TODO.html.
@@ -819,7 +819,7 @@ Clients
http://archives.postgresql.org/pgsql-hackers/2006-12/msg00255.php
-* pg_dump
+* pg_dump / pg_restore
o %Add dumping of comments on index columns and composite type columns
o %Add full object name to the tag field. eg. for operators we need
'=(integer, integer)', instead of just '='.
@@ -838,6 +838,40 @@ Clients
COMMENT ON CURRENT DATABASE.
o Remove unnecessary function pointer abstractions in pg_dump source
code
+ o Allow pg_dump to utilize multiple CPUs and I/O channels by dumping
+ multiple objects simultaneously
+
+ The difficulty with this is getting multiple dump processes to
+ produce a single dump output file.
+ http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php
+
+ o Allow pg_restore to utilize multiple CPUs and I/O channels by
+ restoring multiple objects simultaneously
+
+ This might require a pg_restore flag to indicate how many
+ simultaneous operations should be performed. Only pg_dump's
+ -Fc format has the necessary dependency information.
+
+ o To better utilize resources, restore data, primary keys, and
+ indexes for a single table before restoring the next table
+
+ Hopefully this will allow the CPU-I/O load to be more uniform
+ for simultaneous restores. The idea is to start data restores
+ for several objects, and once the first object is done, to move
+ on to its primary keys and indexes. Over time, simultaneous
+ data loads and index builds will be running.
+
+ o To better utilize resources, allow pg_restore to check foreign
+ keys simultaneously, where possible
+ o Allow pg_restore to create all indexes of a table
+ concurrently, via a single heap scan
+
+ This requires a pg_dump -Fc file because that format contains
+ the required dependency information.
+ http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
+
+ o Allow pg_restore to load different parts of the COPY data
+ simultaneously
* ecpg
@@ -967,9 +1001,8 @@ Indexes
downtime.
* Allow multiple indexes to be created concurrently, ideally via a
- single heap scan, and have a restore of a pg_dump somehow use it
+ single heap scan, and have pg_restore use it
- http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
* Inheritance
diff --git a/doc/src/FAQ/TODO.html b/doc/src/FAQ/TODO.html
index 34fd7bf93eb..9564d099216 100644
--- a/doc/src/FAQ/TODO.html
+++ b/doc/src/FAQ/TODO.html
@@ -8,7 +8,7 @@
Current maintainer: Bruce Momjian (bruce@momjian.us)
-Last updated: Mon Mar 3 16:26:04 EST 2008
+Last updated: Mon Mar 3 20:33:10 EST 2008
The most recent version of this document can be viewed at
http://www.postgresql.org/docs/faqs.TODO.html.
@@ -727,7 +727,7 @@ first. There is also a developer's wiki at
http://archives.postgresql.org/pgsql-hackers/2006-12/msg00255.php
- pg_dump
+ pg_dump / pg_restore
- %Add dumping of comments on index columns and composite type columns
- %Add full object name to the tag field. eg. for operators we need
@@ -747,6 +747,36 @@ first. There is also a developer's wiki at
COMMENT ON CURRENT DATABASE.
- Remove unnecessary function pointer abstractions in pg_dump source
code
+
- Allow pg_dump to utilize multiple CPUs and I/O channels by dumping
+ multiple objects simultaneously
+
The difficulty with this is getting multiple dump processes to
+ produce a single dump output file.
+ http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php
+
+ - Allow pg_restore to utilize multiple CPUs and I/O channels by
+ restoring multiple objects simultaneously
+
This might require a pg_restore flag to indicate how many
+ simultaneous operations should be performed. Only pg_dump's
+ -Fc format has the necessary dependency information.
+
+ - To better utilize resources, restore data, primary keys, and
+ indexes for a single table before restoring the next table
+
Hopefully this will allow the CPU-I/O load to be more uniform
+ for simultaneous restores. The idea is to start data restores
+ for several objects, and once the first object is done, to move
+ on to its primary keys and indexes. Over time, simultaneous
+ data loads and index builds will be running.
+
+ - To better utilize resources, allow pg_restore to check foreign
+ keys simultaneously, where possible
+
- Allow pg_restore to create all indexes of a table
+ concurrently, via a single heap scan
+
This requires a pg_dump -Fc file because that format contains
+ the required dependency information.
+ http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
+
+ - Allow pg_restore to load different parts of the COPY data
+ simultaneously
ecpg
@@ -860,9 +890,7 @@ first. There is also a developer's wiki at
downtime.
Allow multiple indexes to be created concurrently, ideally via a
- single heap scan, and have a restore of a pg_dump somehow use it
- http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
-
+ single heap scan, and have pg_restore use it
Inheritance
- Allow inherited tables to inherit indexes, UNIQUE constraints,
--
2.39.5