mirror of https://github.com/postgres/postgres
616 lines
25 KiB
Plaintext
616 lines
25 KiB
Plaintext
|
From pgsql-hackers-owner+M59479@postgresql.org Thu Sep 30 15:55:23 2004
|
||
|
Return-path: <pgsql-hackers-owner+M59479@postgresql.org>
|
||
|
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
|
||
|
by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i8UJtHw26165
|
||
|
for <pgman@candle.pha.pa.us>; Thu, 30 Sep 2004 15:55:19 -0400 (EDT)
|
||
|
Received: from localhost (unknown [200.46.204.144])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id A4BDD32A219
|
||
|
for <pgman@candle.pha.pa.us>; Thu, 30 Sep 2004 20:55:10 +0100 (BST)
|
||
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
||
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
||
|
with ESMTP id 24195-05 for <pgman@candle.pha.pa.us>;
|
||
|
Thu, 30 Sep 2004 19:55:08 +0000 (GMT)
|
||
|
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id 537C532A216
|
||
|
for <pgman@candle.pha.pa.us>; Thu, 30 Sep 2004 20:55:10 +0100 (BST)
|
||
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
||
|
Received: from localhost (unknown [200.46.204.144])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id 333B932A1EF
|
||
|
for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Thu, 30 Sep 2004 20:49:20 +0100 (BST)
|
||
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
||
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
||
|
with ESMTP id 21793-04
|
||
|
for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
|
||
|
Thu, 30 Sep 2004 19:49:09 +0000 (GMT)
|
||
|
Received: from authenticity.encs.concordia.ca (authenticity-96.encs.concordia.ca [132.205.96.93])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id BEB6A32A156
|
||
|
for <pgsql-hackers@postgresql.org>; Thu, 30 Sep 2004 20:49:03 +0100 (BST)
|
||
|
Received: from haida.cs.concordia.ca (IDENT:mokhov@haida.cs.concordia.ca [132.205.64.45])
|
||
|
by authenticity.encs.concordia.ca (8.12.11/8.12.11) with ESMTP id i8UJn0Xe022202;
|
||
|
Thu, 30 Sep 2004 15:49:00 -0400
|
||
|
Date: Thu, 30 Sep 2004 15:49:00 -0400 (EDT)
|
||
|
From: "Serguei A. Mokhov" <mokhov@cs.concordia.ca>
|
||
|
To: pgsql-hackers@postgresql.org
|
||
|
cc: "Serguei A. Mokhov" <mokhov@cs.concordia.ca>
|
||
|
Subject: [HACKERS] pg_upgrade project: high-level design proposal of in-place upgrade
|
||
|
facility
|
||
|
Message-ID: <Pine.LNX.4.58.0409301532390.4685@haida.cs.concordia.ca>
|
||
|
MIME-Version: 1.0
|
||
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
||
|
X-Virus-Scanned: by amavisd-new at hub.org
|
||
|
X-Mailing-List: pgsql-hackers
|
||
|
Precedence: bulk
|
||
|
Sender: pgsql-hackers-owner@postgresql.org
|
||
|
X-Virus-Scanned: by amavisd-new at hub.org
|
||
|
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on
|
||
|
candle.pha.pa.us
|
||
|
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham
|
||
|
version=2.61
|
||
|
Status: OR
|
||
|
|
||
|
Hello dear all,
|
||
|
|
||
|
[Please CC your replies to me as I am on the digest mode]
|
||
|
|
||
|
Here's finally a very high-level design proposal of the pg_upgrade feature
|
||
|
I was handwaiving a couple of weeks ago. Since, I am almost done with the
|
||
|
moving, I can allocate some time for this for 8.1/8.2.
|
||
|
|
||
|
If this topic is of interest to you, please read on until the very end
|
||
|
before flaming or bashing the ideas out. I had designed that thing and
|
||
|
kept updating (the design) more or less regularly, and also reflected some
|
||
|
issues from the nearby threads [1] and [2].
|
||
|
|
||
|
This design is very high-level at the moment and is not very detailed. I
|
||
|
will need to figure out more stuff as I go and design some aspects in
|
||
|
finer detail. I started to poke around asking for initdb-forcing code
|
||
|
paths in [3], but got no response so far. But I guess if the general idea
|
||
|
or, rather, ideas accepted I will insist on more information more
|
||
|
aggressively :) if I can't figure something out for myself.
|
||
|
|
||
|
[1] http://archives.postgresql.org/pgsql-hackers/2004-09/msg00000.php
|
||
|
[2] http://archives.postgresql.org/pgsql-hackers/2004-09/msg00382.php
|
||
|
[3] http://archives.postgresql.org/pgsql-hackers/2004-08/msg01594.php
|
||
|
|
||
|
Comments are very welcome, especially _*CONSTRUCTIVE*_...
|
||
|
|
||
|
Thank you, and now sit back and read...
|
||
|
|
||
|
CONTENTS:
|
||
|
=========
|
||
|
|
||
|
1. The Need
|
||
|
1. Utilities and User's View of the pg_upgrade Feature
|
||
|
2. Storage Management
|
||
|
- Storage Managers and the smgr API
|
||
|
3. Source Code Maintenance Aspects
|
||
|
2. The Upgrade Sequence
|
||
|
4. Proposed Implementation Plan
|
||
|
- initdb() API
|
||
|
- upgrade API
|
||
|
|
||
|
|
||
|
1. The Need
|
||
|
-----------
|
||
|
|
||
|
It's been a problem for PG for quite awhile now to have a less painful
|
||
|
upgrade procedure with every new revision of PostgreSQL, so the
|
||
|
dump/restore sequence is required. That can take a while for a production
|
||
|
DB, while keeping it offline. The new replication-related solutions, such
|
||
|
as Slony I, pg_pool, and others can remedy the problem somewhat, but
|
||
|
require to roughly double the storage requirements of a given database
|
||
|
while replicating from the older server to a newer one.
|
||
|
|
||
|
The proposed implementation of an in-server pg_upgrade facility attempts
|
||
|
to address both issues at the same time -- a possibility to keep the
|
||
|
server running and upgrading lazily w/o doubling the storage requirements
|
||
|
(there will be some extra disk space taken, but far from doubling the
|
||
|
size). The in-process upgrade will not take much of down time and won't
|
||
|
require that much memory/disk/network resources as replication solutions
|
||
|
do.
|
||
|
|
||
|
|
||
|
Prerequisites
|
||
|
-------------
|
||
|
|
||
|
Ideally, the (maybe not so anymore) ambitious goal is to simply be able to
|
||
|
"drop in" the new binaries of the new server and kick off on the older
|
||
|
version of data files. I think is this feasible now a lot more than before
|
||
|
since we have those things available, which should ease up the
|
||
|
implementation:
|
||
|
|
||
|
- bgwriter
|
||
|
- pg_autovacuum (the one to be integrated into the backend in 8.1)
|
||
|
- smgr API for pluggable storage managers
|
||
|
- initdb in C
|
||
|
- ...
|
||
|
|
||
|
initdb in C, bgwriter and pg_autovacuum, and pluggable storage manager
|
||
|
have made the possibility of creation of the Upgrade Subsystem for
|
||
|
PostgreSQL to be something more reasonable, complete, feasible, and sane
|
||
|
to a point.
|
||
|
|
||
|
|
||
|
Utilities and the User's (DBA) View of the Feature
|
||
|
--------------------------------------------------
|
||
|
|
||
|
Two instances exist:
|
||
|
|
||
|
pg_upgrade (in C)
|
||
|
|
||
|
A standalone utility to upgrade the binary on-disk format from one
|
||
|
version to another when the database is offline.
|
||
|
We should always have this as an option.
|
||
|
pg_upgrade will accept sub/super set of pg_dump(all)/pg_restore
|
||
|
options that do not require a connection. I haven't
|
||
|
thought through this in detail yet.
|
||
|
|
||
|
pg_autoupgrade
|
||
|
|
||
|
a postgres subprocess, modeled after bgwriter and pg_autovacuum
|
||
|
daemons. This will work when the database system is running
|
||
|
on old data directory, and lazily converting relations to the new
|
||
|
format.
|
||
|
|
||
|
pg_autoupgrade daemon can be triggered by the following events in addition
|
||
|
to the lazy upgrade process:
|
||
|
|
||
|
"SQL" level: UPGRADE <ALL | relation_name [, relation_name]> [NOW | time]
|
||
|
|
||
|
While the database won't be offline running over older database files,
|
||
|
SELECT/read-only queries would be allowed using older storage managers*.
|
||
|
Any write operation on old data will act using write-invalidate approach
|
||
|
that will force the upgrade the affected relations to the new format to be
|
||
|
scheduled after the relation-in-progress.
|
||
|
|
||
|
(* See the "Storage Management" section.)
|
||
|
|
||
|
Availability of the relations while upgrade is in progress is likely to be
|
||
|
the same as in VACUUM FULL for that relation, i.e. the entire relation is
|
||
|
locked until the upgrade is complete. Maybe we could optimize that by
|
||
|
locking only particular pages of relations, I have to figure that out.
|
||
|
|
||
|
The upgrade of indices can be done using REINDEX, which seems far less
|
||
|
complicated than trying to convert its on-disk representation. This has
|
||
|
to be done after the relation is converted. Alternatively, the index
|
||
|
upgrade can simply be done by "CREATE INDEX" after the upgrade of
|
||
|
relations.
|
||
|
|
||
|
The relations to be upgraded are ordered according to some priority, e.g.
|
||
|
system relations being first, then user-owned relations. System relations
|
||
|
upgrade is forced upon the postmaster startup, and then user relations are
|
||
|
processed lazily.
|
||
|
|
||
|
So, in a sense, pg_autoupgrade will act like a proxy choosing appropriate
|
||
|
storage manager (like a driver) between the new server and the old data
|
||
|
file upgrading them on-demand. For that purpose we might need to add a
|
||
|
pg_upgradeproxy to intercept backend requests and use appropriate storage
|
||
|
manager. There will be one proxy process per backend.
|
||
|
|
||
|
|
||
|
Storage Management
|
||
|
==================
|
||
|
|
||
|
Somebody has made a possibility to plug a different storage manager in
|
||
|
postgres and we even had two of them at some point . for the magnetic disk
|
||
|
and the main memory. The main memory one is gone, but the smgr API is
|
||
|
still there. Some were dubious why we would ever need another third-party
|
||
|
storage manager, but here I propose to "plug in" storage managers from the
|
||
|
older Postgres versions itself! Here is where the pluggable storage
|
||
|
manager API would be handy once fully resurrected. Instead of trying to
|
||
|
plug some third party storage managers it will primarily be used by the
|
||
|
storage managers of different versions of Postgres.
|
||
|
|
||
|
We can take the storage manager code from the past maintenance releases,
|
||
|
namely 6.5.3, 7.0.3, 7.1.3, 7.2.5, 7.3.7, 7.4.5, and 8.0, and arrange them
|
||
|
in appropriate fashion and have them implement the API properly. Anyone
|
||
|
can contribute a storage manager as they see fit, there's no need to get
|
||
|
them all at once. As a trial implementation I will try to do the last
|
||
|
three or four maybe.
|
||
|
|
||
|
|
||
|
Where to put relations being upgraded?
|
||
|
--------------------------------------
|
||
|
|
||
|
At the beginning of the upgrade process if pg detects the old version of
|
||
|
data files, it moves them under $PGDATA/<ver>, and keeps the old relations
|
||
|
there until upgraded. The relations to be upgraded will be kept in the
|
||
|
pg_upgrade_catalog. Once all relations upgraded, the <ver> directory is
|
||
|
removed and the auto and proxy processes are shut down. The contents of
|
||
|
the pg_upgrade_catalog emptied. The only issue remains is how to deal
|
||
|
with tablespaces (or LOCATION in 7.* releases) elsewhere .- this can
|
||
|
probably be addressed in the similar fashion, but having a
|
||
|
/my/tablespace/<ver> directory.
|
||
|
|
||
|
Source Code Maintenance
|
||
|
=======================
|
||
|
|
||
|
Now, after the above some of you may get scared on the amount of similar
|
||
|
code to possibly maintain in all those storage managers, but in reality
|
||
|
they would require as much maintenance as the corresponding releases do
|
||
|
get back-patched in that code area, and some are not being maintained for
|
||
|
quite some time already. Plus, I should be around to maintain it, should
|
||
|
this become realized.
|
||
|
|
||
|
Release-time Maintenance
|
||
|
------------------------
|
||
|
|
||
|
For maintenance of pg_upgrade itself, one will have to fork out a new
|
||
|
storage manager from the previous stable release and "register" it within
|
||
|
the system. Alternatively, the new storage manager can be forked when the
|
||
|
new release cycle begins. Additionally, a pg_upgrade version has to be
|
||
|
added implementing the API steps outlined in the pg_upgrade API section.
|
||
|
|
||
|
|
||
|
Implementation Steps
|
||
|
====================
|
||
|
|
||
|
To materialize the above idea, I'd proceed as follows:
|
||
|
|
||
|
*) Provide the initdb() API (quick)
|
||
|
|
||
|
*) Resurrect the pluggable storage manager API to be usable for the
|
||
|
purpose.
|
||
|
|
||
|
*) Document it
|
||
|
|
||
|
*) Implement pg_upgrade API for 8.0 and 7.4.5.
|
||
|
|
||
|
*) Extract 8.0 and 7.4.5 storage managers and have them implement the API
|
||
|
as a proof of concept. Massage the API as needed.
|
||
|
|
||
|
*) Document the process of adding new storage managers and pg_upgrade
|
||
|
drivers.
|
||
|
|
||
|
*) Extract other versions storage managers.
|
||
|
|
||
|
|
||
|
pg_upgrade sequence
|
||
|
-------------------
|
||
|
|
||
|
pg_upgrade API for the steps below to update for the next release.
|
||
|
|
||
|
What to do with WAL?? Maybe upgrade can simply be done using WAL replay
|
||
|
with old WAL manager? Not, fully, because not everything is in WAL, but
|
||
|
some WAL recovery maybe needed in case the server was not shutdown cleanly
|
||
|
before the upgrade.
|
||
|
|
||
|
pg_upgrade will proceed as follows:
|
||
|
|
||
|
- move PGDATA to PGDATA/<major pg version>
|
||
|
- move tablespaces likewise
|
||
|
- optional recovery from WAL in case old server was not shutdown properly
|
||
|
-? Shall I upgrade PITR logs of 8.x??? So one can recover to a
|
||
|
point-in-time in the upgraded database?
|
||
|
- CLUSTER all old data
|
||
|
- ANALYZE all old data
|
||
|
- initdb() new system catalogs
|
||
|
- Merge in modifications from old system catalogs
|
||
|
- upgrade schemas/users
|
||
|
-- variations
|
||
|
- upgrade user relations
|
||
|
|
||
|
Upgrade API:
|
||
|
------------
|
||
|
|
||
|
First draft, to be refined multiple times, but to convey the ideas behind:
|
||
|
|
||
|
moveData()
|
||
|
movePGData()
|
||
|
moveTablespaces() 8.0+
|
||
|
moveDbLocation() < 8.0
|
||
|
|
||
|
preliminaryRecovery()
|
||
|
- WAL??
|
||
|
- PITR 8.0+??
|
||
|
|
||
|
preliminaryCleanup()
|
||
|
CLUSTER -- recover some dead space
|
||
|
ANALYZE -- gives us stats
|
||
|
|
||
|
upgradeSystemInfo()
|
||
|
initdb()
|
||
|
mergeOldCatalogs()
|
||
|
mergeOldTemplates()
|
||
|
|
||
|
upgradeUsers()
|
||
|
|
||
|
upgradeSchemas()
|
||
|
- > 7.2, else NULL
|
||
|
|
||
|
upgradeUserRelations()
|
||
|
upgradeIndices()
|
||
|
DROP/CREATE
|
||
|
|
||
|
upgradeInit()
|
||
|
{
|
||
|
|
||
|
}
|
||
|
|
||
|
The main body in pseudocode:
|
||
|
|
||
|
upgradeLoop()
|
||
|
{
|
||
|
moveData();
|
||
|
preliminaryRecovery();
|
||
|
preliminaryCleanup();
|
||
|
upgradeSystemInfo();
|
||
|
upgradeUsers();
|
||
|
upgradeSchemas();
|
||
|
upgradeUserRelations();
|
||
|
}
|
||
|
|
||
|
Something along these lines the API would be:
|
||
|
|
||
|
typedef struct t_upgrade
|
||
|
{
|
||
|
bool (*moveData) (void);
|
||
|
bool (*preliminaryRecovery) (void); /* may be NULL */
|
||
|
bool (*preliminaryCleanup) (void); /* may be NULL */
|
||
|
bool (*upgradeSystemInfo) (void); /* may be NULL */
|
||
|
bool (*upgradeUsers) (void); /* may be NULL */
|
||
|
bool (*upgradeSchemas) (void); /* may be NULL */
|
||
|
bool (*upgradeUserRelations) (void); /* may be NULL */
|
||
|
} t_upgrade;
|
||
|
|
||
|
|
||
|
The above sequence is executed by either pg_upgrade utility uninterrupted
|
||
|
or by the pg_autoupgrade daemon. In the former the upgrade priority is
|
||
|
simply by OID, in the latter also, but can be overridden by the user using
|
||
|
the UPGRADE command to schedule relations upgrade, write operation can
|
||
|
also change such schedule, with user's selected choice to be first. The
|
||
|
more write requests a relation receives while in the upgrade queue, its
|
||
|
priority increases; thus, the relation with most hits is on top. In case
|
||
|
of tie, OID is the decision mark.
|
||
|
|
||
|
Some issues to look into:
|
||
|
|
||
|
- catalog merger
|
||
|
- a crash in the middle of upgrade
|
||
|
- PITR logs for 8.x+
|
||
|
- ...
|
||
|
|
||
|
Flames and Handwaiving
|
||
|
----------------------
|
||
|
|
||
|
Okay, flame is on, but before you flame, mind you, this is a very initial
|
||
|
version of the design. Some of the ideas may seem far fetched, the
|
||
|
contents may seem messy, but I believe it's now more doable than ever and
|
||
|
I am willing to put effort in it for the next release or two and then
|
||
|
maintain it afterwards. It's not going to be done in one shot maybe, but
|
||
|
incrementally, using input, feedback, and hints from you, guys.
|
||
|
|
||
|
Thank you for reading till this far :-) I.d like to hear from you if any
|
||
|
of this made sense to you.
|
||
|
|
||
|
Truly yours,
|
||
|
|
||
|
--
|
||
|
Serguei A. Mokhov | /~\ The ASCII
|
||
|
Computer Science Department | \ / Ribbon Campaign
|
||
|
Concordia University | X Against HTML
|
||
|
Montreal, Quebec, Canada | / \ Email!
|
||
|
|
||
|
---------------------------(end of broadcast)---------------------------
|
||
|
TIP 5: Have you checked our extensive FAQ?
|
||
|
|
||
|
http://www.postgresql.org/docs/faqs/FAQ.html
|
||
|
|
||
|
From pgsql-hackers-owner+M59486@postgresql.org Thu Sep 30 16:39:54 2004
|
||
|
Return-path: <pgsql-hackers-owner+M59486@postgresql.org>
|
||
|
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
|
||
|
by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i8UKdrw02740
|
||
|
for <pgman@candle.pha.pa.us>; Thu, 30 Sep 2004 16:39:53 -0400 (EDT)
|
||
|
Received: from localhost (unknown [200.46.204.144])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id EF25F329E3B
|
||
|
for <pgman@candle.pha.pa.us>; Thu, 30 Sep 2004 21:39:48 +0100 (BST)
|
||
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
||
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
||
|
with ESMTP id 38456-02 for <pgman@candle.pha.pa.us>;
|
||
|
Thu, 30 Sep 2004 20:39:46 +0000 (GMT)
|
||
|
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id 88673329C6B
|
||
|
for <pgman@candle.pha.pa.us>; Thu, 30 Sep 2004 21:39:48 +0100 (BST)
|
||
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
||
|
Received: from localhost (unknown [200.46.204.144])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id AA522329DAE
|
||
|
for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Thu, 30 Sep 2004 21:37:43 +0100 (BST)
|
||
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
||
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
||
|
with ESMTP id 38130-02
|
||
|
for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
|
||
|
Thu, 30 Sep 2004 20:37:39 +0000 (GMT)
|
||
|
Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id 846E9329C63
|
||
|
for <pgsql-hackers@postgresql.org>; Thu, 30 Sep 2004 21:37:39 +0100 (BST)
|
||
|
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
|
by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id i8UKa3jk025254;
|
||
|
Thu, 30 Sep 2004 16:36:03 -0400 (EDT)
|
||
|
To: "Serguei A. Mokhov" <mokhov@cs.concordia.ca>
|
||
|
cc: pgsql-hackers@postgresql.org
|
||
|
Subject: Re: [HACKERS] pg_upgrade project: high-level design proposal of in-place upgrade facility
|
||
|
In-Reply-To: <Pine.LNX.4.58.0409301532390.4685@haida.cs.concordia.ca>
|
||
|
References: <Pine.LNX.4.58.0409301532390.4685@haida.cs.concordia.ca>
|
||
|
Comments: In-reply-to "Serguei A. Mokhov" <mokhov@cs.concordia.ca>
|
||
|
message dated "Thu, 30 Sep 2004 15:49:00 -0400"
|
||
|
Date: Thu, 30 Sep 2004 16:36:02 -0400
|
||
|
Message-ID: <25253.1096576562@sss.pgh.pa.us>
|
||
|
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
|
X-Virus-Scanned: by amavisd-new at hub.org
|
||
|
X-Mailing-List: pgsql-hackers
|
||
|
Precedence: bulk
|
||
|
Sender: pgsql-hackers-owner@postgresql.org
|
||
|
X-Virus-Scanned: by amavisd-new at hub.org
|
||
|
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on
|
||
|
candle.pha.pa.us
|
||
|
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham
|
||
|
version=2.61
|
||
|
Status: OR
|
||
|
|
||
|
"Serguei A. Mokhov" <mokhov@cs.concordia.ca> writes:
|
||
|
> Comments are very welcome, especially _*CONSTRUCTIVE*_...
|
||
|
|
||
|
This is fundamentally wrong, because you are assigning the storage
|
||
|
manager functionality that it does not have. In particular, the
|
||
|
storage manager knows nothing of the contents or format of the files
|
||
|
it is managing, and so you can't realistically expect to use the smgr
|
||
|
switch as a way to support access to tables with different internal
|
||
|
formats. The places that change in interesting ways across versions
|
||
|
are usually far above the smgr switch.
|
||
|
|
||
|
I don't believe in the idea of incremental "lazy" upgrades very much
|
||
|
either. It certainly will not work on the system catalogs --- you have
|
||
|
to convert those in a big-bang fashion, because how are you going to
|
||
|
find the other stuff otherwise? And the real problem with it IMHO is
|
||
|
that if something goes wrong partway through the process, you're in
|
||
|
deep trouble because you have no way to back out. You can't just revert
|
||
|
to the old version because it won't understand your data, and your old
|
||
|
backups that are compatible with it are now out of date. If there are
|
||
|
going to be any problems, you really need to find out about them
|
||
|
immediately while your old backups are still current, not in a "lazy"
|
||
|
fashion.
|
||
|
|
||
|
The design we'd pretty much all bought into six months ago involved
|
||
|
being able to do in-place upgrades when the format/contents of user
|
||
|
relations and indexes is not changing. All you'd have to do is dump
|
||
|
and restore the schema data (system catalogs) which is a reasonably
|
||
|
short process even on a large DB, so the big-bang nature of the
|
||
|
conversion isn't a problem. Admittedly this will not work for every
|
||
|
single upgrade, but we had agreed that we could schedule upgrades
|
||
|
so that the majority of releases do not change user data. Historically
|
||
|
that's been mostly true anyway, even without any deliberate attempt to
|
||
|
group user-data-changing features together.
|
||
|
|
||
|
I think the last major discussion about it started here:
|
||
|
http://archives.postgresql.org/pgsql-hackers/2003-12/msg00379.php
|
||
|
(I got distracted by other stuff and never did the promised work,
|
||
|
but I still think the approach is sound.)
|
||
|
|
||
|
regards, tom lane
|
||
|
|
||
|
---------------------------(end of broadcast)---------------------------
|
||
|
TIP 2: you can get off all lists at once with the unregister command
|
||
|
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
||
|
|
||
|
From pgsql-hackers-owner+M59497@postgresql.org Thu Sep 30 17:44:44 2004
|
||
|
Return-path: <pgsql-hackers-owner+M59497@postgresql.org>
|
||
|
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
|
||
|
by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i8ULihw11377
|
||
|
for <pgman@candle.pha.pa.us>; Thu, 30 Sep 2004 17:44:43 -0400 (EDT)
|
||
|
Received: from localhost (unknown [200.46.204.144])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id D0A6B329E2E
|
||
|
for <pgman@candle.pha.pa.us>; Thu, 30 Sep 2004 22:44:38 +0100 (BST)
|
||
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
||
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
||
|
with ESMTP id 55636-04 for <pgman@candle.pha.pa.us>;
|
||
|
Thu, 30 Sep 2004 21:44:36 +0000 (GMT)
|
||
|
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id 6ED67329DFC
|
||
|
for <pgman@candle.pha.pa.us>; Thu, 30 Sep 2004 22:44:38 +0100 (BST)
|
||
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
||
|
Received: from localhost (unknown [200.46.204.144])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id 040D2329E2E
|
||
|
for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Thu, 30 Sep 2004 22:42:24 +0100 (BST)
|
||
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
||
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
||
|
with ESMTP id 55767-04
|
||
|
for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
|
||
|
Thu, 30 Sep 2004 21:42:18 +0000 (GMT)
|
||
|
Received: from authenticity.encs.concordia.ca (authenticity-96.encs.concordia.ca [132.205.96.93])
|
||
|
by svr1.postgresql.org (Postfix) with ESMTP id E3A4D329DFC
|
||
|
for <pgsql-hackers@postgresql.org>; Thu, 30 Sep 2004 22:42:19 +0100 (BST)
|
||
|
Received: from haida.cs.concordia.ca (IDENT:mokhov@haida.cs.concordia.ca [132.205.64.45])
|
||
|
by authenticity.encs.concordia.ca (8.12.11/8.12.11) with ESMTP id i8ULgJrP001049;
|
||
|
Thu, 30 Sep 2004 17:42:19 -0400
|
||
|
Date: Thu, 30 Sep 2004 17:42:19 -0400 (EDT)
|
||
|
From: "Serguei A. Mokhov" <mokhov@cs.concordia.ca>
|
||
|
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
|
cc: pgsql-hackers@postgresql.org
|
||
|
Subject: Re: [HACKERS] pg_upgrade project: high-level design proposal of
|
||
|
In-Reply-To: <25253.1096576562@sss.pgh.pa.us>
|
||
|
Message-ID: <Pine.LNX.4.58.0409301725550.4685@haida.cs.concordia.ca>
|
||
|
References: <Pine.LNX.4.58.0409301532390.4685@haida.cs.concordia.ca>
|
||
|
<25253.1096576562@sss.pgh.pa.us>
|
||
|
MIME-Version: 1.0
|
||
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
||
|
X-Virus-Scanned: by amavisd-new at hub.org
|
||
|
X-Mailing-List: pgsql-hackers
|
||
|
Precedence: bulk
|
||
|
Sender: pgsql-hackers-owner@postgresql.org
|
||
|
X-Virus-Scanned: by amavisd-new at hub.org
|
||
|
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on
|
||
|
candle.pha.pa.us
|
||
|
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham
|
||
|
version=2.61
|
||
|
Status: OR
|
||
|
|
||
|
On Thu, 30 Sep 2004, Tom Lane wrote:
|
||
|
|
||
|
> Date: Thu, 30 Sep 2004 16:36:02 -0400
|
||
|
>
|
||
|
> "Serguei A. Mokhov" <mokhov@cs.concordia.ca> writes:
|
||
|
> > Comments are very welcome, especially _*CONSTRUCTIVE*_...
|
||
|
>
|
||
|
> This is fundamentally wrong, because you are assigning the storage
|
||
|
> manager functionality that it does not have. In particular, the
|
||
|
|
||
|
Maybe, that's why I was asking of all init-db forcing paths, so I can go
|
||
|
level above smgr to upgrade stuff, let say in access/ and other parts. I
|
||
|
did ask that before and never got a reply. So the concept of "Storage
|
||
|
Managers" may and will go well beyond the smgt API. That's the design
|
||
|
refinement stage.
|
||
|
|
||
|
> I don't believe in the idea of incremental "lazy" upgrades very much
|
||
|
> either. It certainly will not work on the system catalogs --- you have
|
||
|
> to convert those in a big-bang fashion,
|
||
|
|
||
|
I never proposed to do that to system catalogs, on the contrary, I said
|
||
|
the system catalogs are to be upgraded upon the postmaster startup. only
|
||
|
user relations are upgraded lazily:
|
||
|
|
||
|
> The relations to be upgraded are ordered according to some priority,
|
||
|
> e.g. system relations being first, then user-owned relations. System
|
||
|
> relations upgrade is forced upon the postmaster startup, and then user
|
||
|
> relations are processed lazily.
|
||
|
|
||
|
So looks like we don't disagree here :)
|
||
|
|
||
|
> The design we'd pretty much all bought into six months ago involved
|
||
|
> being able to do in-place upgrades when the format/contents of user
|
||
|
> relations and indexes is not changing. All you'd have to do is dump and
|
||
|
> restore the schema data (system catalogs) which is a reasonably short
|
||
|
> process even on a large DB, so the big-bang nature of the conversion
|
||
|
> isn't a problem. Admittedly this will not work for every single
|
||
|
> upgrade, but we had agreed that we could schedule upgrades so that the
|
||
|
> majority of releases do not change user data. Historically that's been
|
||
|
> mostly true anyway, even without any deliberate attempt to group
|
||
|
> user-data-changing features together.
|
||
|
|
||
|
Annoyingly enough, they still do change.
|
||
|
|
||
|
> I think the last major discussion about it started here:
|
||
|
> http://archives.postgresql.org/pgsql-hackers/2003-12/msg00379.php
|
||
|
> (I got distracted by other stuff and never did the promised work,
|
||
|
> but I still think the approach is sound.)
|
||
|
|
||
|
I'll go over that discussion and maybe will combine useful ideas together.
|
||
|
I'll open a pgfoundry project to develop it there and then will submit for
|
||
|
evaluation UNLESS you reserved it for yourself, Tom, to fullfill the
|
||
|
promise... If anybody objects the pgfoundry idea to test the concepts,
|
||
|
I'll apply for a project there.
|
||
|
|
||
|
Thank you for the comments!
|
||
|
|
||
|
> regards, tom lane
|
||
|
>
|
||
|
|
||
|
--
|
||
|
Serguei A. Mokhov | /~\ The ASCII
|
||
|
Computer Science Department | \ / Ribbon Campaign
|
||
|
Concordia University | X Against HTML
|
||
|
Montreal, Quebec, Canada | / \ Email!
|
||
|
|
||
|
---------------------------(end of broadcast)---------------------------
|
||
|
TIP 7: don't forget to increase your free space map settings
|
||
|
|