-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make index creation idempotent #2197
base: main
Are you sure you want to change the base?
Conversation
josephschorr
commented
Jan 8, 2025
- Make each non-transactional migration step contain only a single migration operation. This ensures that if the operation fails, we don't end up "halfway" into a migration step. In the future, we'll enforce this via testing.
- Change how concurrent indexes are created. See: https://www.shayon.dev/post/2024/225/stop-relying-on-if-not-exists-for-concurrent-index-creation-in-postgresql/ for why this is important
fa1c4fe
to
f6f9dac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach seems sensible, although we cannot enforce migration designs like this moving forward.
I don't see why this has to be done in different migrations if the column creation is also designed to be idempotent (e.g., checking the catalog to see if the column was already created). Ideally, we have self-contained migrations, but I don't strongly oppose doing this—it would also work.
Example of checking if column exists:
DO $$
BEGIN
-- Add column if it doesn't exist
IF NOT EXISTS (
SELECT 1
FROM information_schema.columns
WHERE table_name = 'your_table'
AND column_name = 'new_column'
) THEN
EXECUTE 'ALTER TABLE your_table ADD COLUMN new_column type';
END IF;
END $$;
I also wondered why we weren't running those in the same transaction (supported via the migration framework), but it's impossible to run CREATE INDEX CONCURRENTLY
transactionally with another statement.
There is a bunch of caveats with CREATE INDEX CONCURRENTLY
.
One issue I see is that the idempotent drop/create index can enter an unrecoverable (and potentially disruptive by sustaining high-load for an extended period?) state if timeouts (migration timeout, statement timeout) are hit.
- migration runs
- drop index
- create index
- index creation fails due to SQL statement timeout or migration timeout, marked as invalid
- migration terminates with an error
- GOTO 1 (operator retries migration)
Coding defensively would be complex, so I suggest we log the error and provide some advice in the log (e.g., Please consider increasing your statement timeout and migration timeout using flags...
) and let a human intervene.
I also did some research, and there is no scenario in which a session terminates and index creation continues, so we are guaranteed that it will always end up in an invalid
state if it has not succeeded (it will never run detached).
f6f9dac
to
af757e4
Compare
Updated as discussed |
af757e4
to
34d5409
Compare
1) Make each non-transactional migration step contain only a single migration operation. This ensures that if the operation fails, we don't end up "halfway" into a migration step. In the future, we'll enforce this via testing. 2) Change how concurrent indexes are created. See: https://www.shayon.dev/post/2024/225/stop-relying-on-if-not-exists-for-concurrent-index-creation-in-postgresql/ for why this is important
34d5409
to
ec03842
Compare