-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix TOAST Initialization vector #102
Conversation
Currently, we encrypt TOASTed data always with the offset 0. That is not secure. The offset should be unique. This commit replaces the 0 "offset" with TOAST's `va_valueid` (Unique ID of value within the TOAST table) during encryption. This `va_valueid` is available during the TOAST fetch which is crucial for the decryption. During the TOAST externalisation we insert a new tuple which shouldn't be encrypted as the backend will give this tuple to us during the TOAST fetch, hence fetched with non-TDE functions, besides TOAST data already encrypted. For that (insert non-encrypted tuple) I had to modify some TDE AM functions. `pg_tde_toast_save_datum()` was copied from the PG code and modified. Along with `toastrel_valueid_exists()` and `toastid_valueid_exists()`. Fix percona#101
...instead of bool Also more comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have two questions:
- can we also guarantee that we won't have any overlaps with
valueid
? (if not, we can fix that after the changes in my next PR - we can do that later, the question if we have to do that or not) - wouldn't it be better to split it into two commits, one for copying the core code, one for the changes in them? That would make it clearer what are our changes
char data[TOAST_MAX_CHUNK_SIZE + VARHDRSZ]; | ||
/* ensure union is aligned well enough: */ | ||
int32 align_it; | ||
} chunk_data; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that this is upstream code, but any idea why align_it
is here? varlena
should be already at least 4 byte aligned on all supported platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having an int (int 32 align_it) in the union ensures that the union variable will always start from the aligned address. Without this int (since all other members in the union and varlina structure are chars), the union variable can be placed at an unaligned starting address. align_it in the union makes sure the starting address is always aligned no matter the size of the union.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is legacy from times when varlena was 5bytes: https://github.com/postgres/postgres/blob/c67f6f2f573064c206044b44a73cdf0806dfbd4e/src/include/c.h#L411-L415. At least at times when this padding was introduced: postgres/postgres@c67f6f2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more to guard against stack variables of chunk_data to start at an aligned address, and the 'align_it' is still relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason I thought that varlena
is a pointer there, but of course it's not, it's clear then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks perfect now
Currently, we encrypt TOASTed data always with the offset 0. That is not secure. The offset should be unique.
This commit replaces the 0 "offset" with TOAST's
va_valueid
(Unique ID of value within the TOAST table) during encryption. Thisva_valueid
is available during the TOAST fetch which is crucial for the decryption.During the TOAST externalisation we insert a new tuple which shouldn't be encrypted as the backend will give this tuple to us during the TOAST fetch, hence fetched with non-TDE functions, besides TOAST data already encrypted. For that (insert non-encrypted tuple) I had to modify some TDE AM functions.
pg_tde_toast_save_datum()
was copied from the PG code and modified. Along withtoastrel_valueid_exists()
andtoastid_valueid_exists()
.Fix #101