From a273d393b7f9b66890d9bfa55982ac4c27d729c9 Mon Sep 17 00:00:00 2001
From: Bruce Momjian
Date: Tue, 4 Mar 2008 01:33:32 +0000
Subject: [PATCH] Add ideas for concurrent pg_dump and pg_restore:

< * pg_dump
> * pg_dump / pg_restore
>       o Allow pg_dump to utilize multiple CPUs and I/O channels by dumping
>         multiple objects simultaneously
>
>         The difficulty with this is getting multiple dump processes to
>         produce a single dump output file.
>         http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php
>
>       o Allow pg_restore to utilize multiple CPUs and I/O channels by
>         restoring multiple objects simultaneously
>
>         This might require a pg_restore flag to indicate how many
>         simultaneous operations should be performed. Only pg_dump's
>         -Fc format has the necessary dependency information.
>
>       o To better utilize resources, restore data, primary keys, and
>         indexes for a single table before restoring the next table
>
>         Hopefully this will allow the CPU-I/O load to be more uniform
>         for simultaneous restores. The idea is to start data restores
>         for several objects, and once the first object is done, to move
>         on to its primary keys and indexes. Over time, simultaneous
>         data loads and index builds will be running.
>
>       o To better utilize resources, allow pg_restore to check foreign
>         keys simultaneously, where possible
>       o Allow pg_restore to create all indexes of a table
>         concurrently, via a single heap scan
>
>         This requires a pg_dump -Fc file because that format contains
>         the required dependency information.
>         http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
>
>       o Allow pg_restore to load different parts of the COPY data
>         simultaneously
<   single heap scan, and have a restore of a pg_dump somehow use it
>   single heap scan, and have pg_restore use it
<   http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
---
 doc/TODO              | 41 +++++++++++++++++++++++++++++++++++++----
 doc/src/FAQ/TODO.html | 38 +++++++++++++++++++++++++++++++++-----
 2 files changed, 70 insertions(+), 9 deletions(-)

diff --git a/doc/TODO b/doc/TODO
index 7d5d180ad91..2bdfa6ed31f 100644
--- a/doc/TODO
+++ b/doc/TODO
@@ -1,7 +1,7 @@
 PostgreSQL TODO List
 ====================
 Current maintainer: Bruce Momjian (bruce@momjian.us)
-Last updated: Mon Mar 3 16:26:04 EST 2008
+Last updated: Mon Mar 3 20:33:10 EST 2008

 The most recent version of this document can be viewed at
 http://www.postgresql.org/docs/faqs.TODO.html.
@@ -819,7 +819,7 @@ Clients
           http://archives.postgresql.org/pgsql-hackers/2006-12/msg00255.php


-* pg_dump
+* pg_dump / pg_restore
         o %Add dumping of comments on index columns and composite type columns
         o %Add full object name to the tag field. eg. for operators we need
           '=(integer, integer)', instead of just '='.
@@ -838,6 +838,40 @@ Clients
           COMMENT ON CURRENT DATABASE.
         o Remove unnecessary function pointer abstractions in pg_dump source
           code
+        o Allow pg_dump to utilize multiple CPUs and I/O channels by dumping
+          multiple objects simultaneously
+
+          The difficulty with this is getting multiple dump processes to
+          produce a single dump output file.
+          http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php
+
+        o Allow pg_restore to utilize multiple CPUs and I/O channels by
+          restoring multiple objects simultaneously
+
+          This might require a pg_restore flag to indicate how many
+          simultaneous operations should be performed. Only pg_dump's
+          -Fc format has the necessary dependency information.
+
+        o To better utilize resources, restore data, primary keys, and
+          indexes for a single table before restoring the next table
+
+          Hopefully this will allow the CPU-I/O load to be more uniform
+          for simultaneous restores. The idea is to start data restores
+          for several objects, and once the first object is done, to move
+          on to its primary keys and indexes. Over time, simultaneous
+          data loads and index builds will be running.
+
+        o To better utilize resources, allow pg_restore to check foreign
+          keys simultaneously, where possible
+        o Allow pg_restore to create all indexes of a table
+          concurrently, via a single heap scan
+
+          This requires a pg_dump -Fc file because that format contains
+          the required dependency information.
+          http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
+
+        o Allow pg_restore to load different parts of the COPY data
+          simultaneously

 * ecpg

@@ -967,9 +1001,8 @@ Indexes
   downtime.

 * Allow multiple indexes to be created concurrently, ideally via a
-  single heap scan, and have a restore of a pg_dump somehow use it
+  single heap scan, and have pg_restore use it

-  http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php

 * Inheritance

diff --git a/doc/src/FAQ/TODO.html b/doc/src/FAQ/TODO.html
index 34fd7bf93eb..9564d099216 100644
--- a/doc/src/FAQ/TODO.html
+++ b/doc/src/FAQ/TODO.html
@@ -8,7 +8,7 @@

PostgreSQL TODO List

Current maintainer: Bruce Momjian (bruce@momjian.us)
-Last updated: Mon Mar 3 16:26:04 EST 2008
+Last updated: Mon Mar 3 20:33:10 EST 2008

The most recent version of this document can be viewed at
http://www.postgresql.org/docs/faqs.TODO.html.
@@ -727,7 +727,7 @@ first. There is also a developer's wiki at

http://archives.postgresql.org/pgsql-hackers/2006-12/msg00255.php

-  • pg_dump
+  • pg_dump / pg_restore
  • ecpg