Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix TOAST Initialization vector #102

Merged
merged 2 commits into from
Jan 23, 2024
Merged

Fix TOAST Initialization vector #102

merged 2 commits into from
Jan 23, 2024

Conversation

dAdAbird
Copy link
Member

Currently, we encrypt TOASTed data always with the offset 0. That is not secure. The offset should be unique.

This commit replaces the 0 "offset" with TOAST's va_valueid (Unique ID of value within the TOAST table) during encryption. This va_valueid is available during the TOAST fetch which is crucial for the decryption.

During the TOAST externalisation we insert a new tuple which shouldn't be encrypted as the backend will give this tuple to us during the TOAST fetch, hence fetched with non-TDE functions, besides TOAST data already encrypted. For that (insert non-encrypted tuple) I had to modify some TDE AM functions.

pg_tde_toast_save_datum() was copied from the PG code and modified. Along with toastrel_valueid_exists() and toastid_valueid_exists().

Fix #101

Currently, we encrypt TOASTed data always with the offset 0. That is
not secure. The offset should be unique.

This commit replaces the 0 "offset" with TOAST's `va_valueid` (Unique
ID of value within the TOAST table) during encryption. This
`va_valueid` is available during the TOAST fetch which is crucial for
the decryption.

During the TOAST externalisation we insert a new tuple which shouldn't
be encrypted as the backend will give this tuple to us during the TOAST
fetch, hence fetched with non-TDE functions, besides TOAST data already
encrypted. For that (insert non-encrypted tuple) I had to modify some
TDE AM functions.

`pg_tde_toast_save_datum()` was copied from the PG code and modified.
Along with `toastrel_valueid_exists()` and `toastid_valueid_exists()`.

Fix percona#101
...instead of bool

Also more comments.
@dAdAbird dAdAbird requested a review from codeforall January 19, 2024 19:27
Copy link
Collaborator

@dutow dutow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two questions:

  • can we also guarantee that we won't have any overlaps with valueid? (if not, we can fix that after the changes in my next PR - we can do that later, the question if we have to do that or not)
  • wouldn't it be better to split it into two commits, one for copying the core code, one for the changes in them? That would make it clearer what are our changes

char data[TOAST_MAX_CHUNK_SIZE + VARHDRSZ];
/* ensure union is aligned well enough: */
int32 align_it;
} chunk_data;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that this is upstream code, but any idea why align_it is here? varlena should be already at least 4 byte aligned on all supported platforms.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having an int (int 32 align_it) in the union ensures that the union variable will always start from the aligned address. Without this int (since all other members in the union and varlina structure are chars), the union variable can be placed at an unaligned starting address. align_it in the union makes sure the starting address is always aligned no matter the size of the union.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is legacy from times when varlena was 5bytes: https://github.com/postgres/postgres/blob/c67f6f2f573064c206044b44a73cdf0806dfbd4e/src/include/c.h#L411-L415. At least at times when this padding was introduced: postgres/postgres@c67f6f2

Copy link
Contributor

@codeforall codeforall Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more to guard against stack variables of chunk_data to start at an aligned address, and the 'align_it' is still relevant.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason I thought that varlena is a pointer there, but of course it's not, it's clear then.

Copy link
Contributor

@codeforall codeforall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks perfect now

@dAdAbird dAdAbird merged commit 5c08e3b into percona:main Jan 23, 2024
5 checks passed
@dAdAbird dAdAbird deleted the toast_iv branch January 23, 2024 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix TOAST Initialization vector
3 participants