-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pal sgx fast clock gettimeofday #1834
base: master
Are you sure you want to change the base?
Pal sgx fast clock gettimeofday #1834
Conversation
8e76328
to
9679f96
Compare
extern fast_clock_t g_fast_clock; | ||
|
||
int fast_clock__get_time(fast_clock_t* fast_clock, uint64_t* time_micros, bool force_new_timepoint); | ||
void fast_clock__get_timezone(const fast_clock_t* fast_clock, int* tz_minutewest, int* tz_dsttime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing implementation
void fast_clock__disable(fast_clock_t* fast_clock); | ||
|
||
#ifdef __cplusplus | ||
} /* extern int */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo - extern "C"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 11 files reviewed, 28 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jonathan-sha)
a discussion (no related file):
I just briefly looked at the PR and left some comments. Didn't go deep into the logic itself.
Please fix according to the comments, and then we could start the review in earnest.
pal/src/host/linux-sgx/enclave_ocalls.h
line 92 at r1 (raw file):
int ocall_futex(uint32_t* uaddr, int op, int val, uint64_t* timeout_us); int ocall_reset_time(uint64_t* microsec, uint64_t* tsc, int* tz_minuteswest, int* tz_dsttime);
Can we remove the tz_
args and handling from this PR? You don't use them here anyway, and it's harder to review.
pal/src/host/linux-sgx/enclave_ocalls.c
line 1810 at r1 (raw file):
*microsec_ptr = MAX(microsec, expected_microsec); if (tsc_ptr != NULL) { *tsc_ptr = COPY_UNTRUSTED_VALUE(&ocall_gettime_args->tsc);
Do you have some sanity checks on the returned-by-host tsc
value? It must be at least non-decreasing. We have a similar check for returned-by-host microsec
. This is to prevent attacks.
pal/src/host/linux-sgx/enclave_ocalls.c
line 1825 at r1 (raw file):
{ return ocall_reset_time(microsec_ptr, NULL, NULL, NULL); }
That's a bad security practice -- you added an additional attack surface (a new OCALL) for no good reason. Just expand existing ocall_gettime
to have an additional argument tsc_ptr
, and remove ocall_reset_time()
.
pal/src/host/linux-sgx/host_ocalls.c
line 608 at r1 (raw file):
struct timeval tv; struct timezone tz; unsigned long long tsc = __rdtsc();
We have get_tsc()
which does the same. Please remove x86intrin.h
pal/src/host/linux-sgx/pal_main.c
line 561 at r1 (raw file):
"not expose corresponding CPUID leaves). This degrades performance."); } }
Why did you remove print_warning_on_invariant_tsc()
?
pal/src/host/linux-sgx/pal_misc.c
line 319 at r1 (raw file):
/* hypervisor-specific CPUID leaf functions (0x40000000 - 0x400000FF) start here */ {.leaf = 0x40000000, .zero_subleaf = true, .cache = true}, /* CPUID Info */ {.leaf = 0x40000010, .zero_subleaf = true, .cache = true}, /* VMWare-style Timing Info */
These 2 leaves won't be needed now, right? We can remove them.
pal/src/host/linux-sgx/utils/fast_clock.h
line 2 at r1 (raw file):
#ifndef _FAST_CLOCK_H_ #define _FAST_CLOCK_H_
We use #pragma once
pal/src/host/linux-sgx/utils/fast_clock.h
line 9 at r1 (raw file):
#ifdef __cplusplus extern "C" { #endif
There is nothing C++ in our codebase, please remove these guards.
pal/src/host/linux-sgx/utils/fast_clock.h
line 13 at r1 (raw file):
enum fast_clock_state_e {
We have a different coding style (based on Google C/C++ coding style), see https://gramine.readthedocs.io/en/stable/devel/coding-style.html
pal/src/host/linux-sgx/utils/fast_clock.h
line 19 at r1 (raw file):
FC_STATE_INIT, FC_STATE_RDTSC_DISABLED,
I don't like the FC_
, it's not immediately obvious what it is.
Hm, is this enum really needed to be in the header? It looks like it can be embedded in the source file itself. Then it won't be visible in other compilation units, and I'd be fine with FC_
since it will become local-to-single-file.
pal/src/host/linux-sgx/utils/fast_clock.h
line 39 at r1 (raw file):
typedef union { #pragma pack(push, 1)
Why you need this bit-compression?
pal/src/host/linux-sgx/utils/fast_clock.c
line 1 at r1 (raw file):
#include <string.h>
You should use "api.h"
pal/src/host/linux-sgx/utils/fast_clock.c
line 10 at r1 (raw file):
/** * FastClock
Please reformat to 100-chars-per-line.
pal/src/host/linux-sgx/utils/fast_clock.c
line 28 at r1 (raw file):
* *** Implementation *** * * FastClock is implemented as a state machine. This was done since we don't have a good portable way to get the cpu clock frequency.
It would be nicer if you would ASCII-draw the state machine with transitions.
pal/src/host/linux-sgx/utils/fast_clock.c
line 81 at r1 (raw file):
#ifndef SEC_USEC #define SEC_USEC ((uint64_t)1000000) #endif
We have some macro for this in api.h
, please check (I don't remember the name)
pal/src/host/linux-sgx/utils/fast_clock.c
line 84 at r1 (raw file):
#define RDTSC_CALIBRATION_TIME ((uint64_t)1 * SEC_USEC) #define RDTSC_RECALIBRATION_INTERVAL ((uint64_t)120 * SEC_USEC)
How did you decide on these values? Please add a comment.
pal/src/host/linux-sgx/utils/fast_clock.c
line 139 at r1 (raw file):
static inline long reset_timepoint(fast_clock_timepoint_t* timepoint) { int ret = ocall_reset_time(&timepoint->t0_usec, &timepoint->tsc0, &timepoint->tz_minutewest, &timepoint->tz_dsttime);
Need to analyze security and validate received-from-host values
pal/src/host/linux-sgx/utils/fast_clock.c
line 148 at r1 (raw file):
} static inline fast_clock_desc_t desc_atomic_load(const fast_clock_t* fast_clock, int mo)
What's the point in this helper function? It brings no benefit.
pal/src/host/linux-sgx/utils/fast_clock.c
line 155 at r1 (raw file):
} static inline void desc_atomic_store(fast_clock_t* fast_clock, fast_clock_desc_t new_desc, int mo)
What's the point in this helper function? It brings no benefit.
pal/src/host/linux-sgx/utils/fast_clock.c
line 164 at r1 (raw file):
static inline int handle_state_rdtsc(fast_clock_t* fast_clock, fast_clock_desc_t descriptor, uint64_t* time_usec, bool force_new_timepoint); static inline int handle_state_rdtsc_recalibrate(fast_clock_t* fast_clock, fast_clock_desc_t descriptor, uint64_t* time_usec); static int handle_state_rdtsc_disabled(uint64_t* time_usec);
This is ugly, why can't you define functions in the order of dependency?
pal/src/host/linux-sgx/utils/fast_clock.c
line 166 at r1 (raw file):
static int handle_state_rdtsc_disabled(uint64_t* time_usec); int fast_clock__get_time(fast_clock_t* fast_clock, uint64_t* time_usec, bool force_new_timepoint)
We don't use the __
style. Just one _
is enough.
pal/src/host/linux-sgx/utils/fast_clock.c
line 290 at r1 (raw file):
// all callers in this state will perform an OCALL - no need to set the change_state_guard before OCALLing uint64_t tmp_tsc = 0; int ret = ocall_reset_time(time_usec, &tmp_tsc, NULL, NULL);
Need to analyze security and validate received-from-host values
pal/src/host/linux-sgx/utils/fast_clock.c
line 350 at r1 (raw file):
uint64_t tsc = 0; int ret = ocall_reset_time(time_usec, &tsc, NULL, NULL);
Need to analyze security and validate received-from-host values
pal/src/host/linux-sgx/utils/fast_clock.c
line 362 at r1 (raw file):
return ret; }
Please add newlines at the end of each file.
Thanks for the review!
yes, I can remove this. There's currently an "unimplemented" comment in the gettimeofday libos method, as well as issue #466 - do you prefer I fix this as well (at least partially, for gettimeofday())?
This is what I did actually, I renamed
This is the host code (outside of sgx) - should I still use the enclave reimplementation of rdtsc?
I need to rework the initialization a bit. I'll re-add this print, but I'm not sure we need it to happen in the main function.
I changed this to a bitfield in my company's codebase (I'll pull in these changes). I just wanted to make sure this struct fits in a uint32_t which we can load \ store atomically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 11 files reviewed, 28 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jonathan-sha)
pal/src/host/linux-sgx/enclave_ocalls.h
line 92 at r1 (raw file):
do you prefer I fix this as well (at least partially, for gettimeofday())?
Sorry, don't understand your proposal. Fix how?
To make it clear, I suggest the following:
- Remove in this PR everything about timezones.
- Create a follow-up PR to this one that adds timezones (and thus fixes [LibOS] time zone is incorrect in syscall
gettimeofday()
or glibclocaltime()
or bash commanddate
#466).
pal/src/host/linux-sgx/enclave_ocalls.c
line 1825 at r1 (raw file):
I kept
ocall_gettime
for compatibility with the rest of the code.
No reason to keep compatibility. Just change the code. (These are internal Gramine functions, so can be changed without caring about compatibility issues.)
pal/src/host/linux-sgx/host_ocalls.c
line 608 at r1 (raw file):
Previously, jonathan-sha (Jonathan Shamir) wrote…
This is the host code (outside of sgx) - should I still use the enclave reimplementation of rdtsc?
We have this func in cpu.h
:
gramine/common/include/arch/x86_64/cpu.h
Line 66 in fcdd1d5
static inline uint64_t get_tsc(void) { |
The cpu.h
header is independent from enclave/non-enclave environment. So yes, just use it.
pal/src/host/linux-sgx/utils/fast_clock.h
line 39 at r1 (raw file):
Previously, jonathan-sha (Jonathan Shamir) wrote…
I changed this to a bitfield in my company's codebase (I'll pull in these changes). I just wanted to make sure this struct fits in a uint32_t which we can load \ store atomically.
I see. You can change to a bitfield here too.
Also, please add a static_assert(sizeof (fast_clock_desc_t) == sizeof(uint64_t))
to make it explicit. Plus add a comment that this struct must be accessed with atomic ops.
4927cdb
to
7f41646
Compare
@dimakuv I updated the MR, fixed most of your comments. Two big open issues that I would like to get feedback on -
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replied to 1. under the same discussion in the code.
should we update the gettimeofday() tests?
We could add/improve some, but I'm not sure if it's worth comparing to the host - I'm afraid that they may fail randomly if we get unlucky with a CI worker load and we'll end up debugging CI failures which are just flaky tests, not real bugs.
We should add a 5-minute stress test.
No, we shouldn't :) 5 minutes is really a lot of CI time spent on a single feature.
Reviewed all commit messages.
Reviewable status: 0 of 12 files reviewed, 31 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @jonathan-sha)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I just briefly looked at the PR and left some comments. Didn't go deep into the logic itself.
Please fix according to the comments, and then we could start the review in earnest.
Same. Will do a more throughout review later.
Suggestion:
[PAL/Linux-SGX] Add ocall_reset_clock which returns tsc and timezone info
-- commits
line 7 at r2:
I don't think there's a point in splitting this change into 4 commits, as far as I understand they can't/shouldn't be used separately, so they aren't really independent of each other.
-- commits
line 19 at r2:
Please use fixup commits when adding fixups, see https://gramine.readthedocs.io/en/latest/devel/contributing.html#pr-life-cycle. And in general, please read that contributing guide :)
pal/src/host/linux-sgx/enclave_ocalls.h
line 92 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
do you prefer I fix this as well (at least partially, for gettimeofday())?
Sorry, don't understand your proposal. Fix how?
To make it clear, I suggest the following:
- Remove in this PR everything about timezones.
- Create a follow-up PR to this one that adds timezones (and thus fixes [LibOS] time zone is incorrect in syscall
gettimeofday()
or glibclocaltime()
or bash commanddate
#466).
+1, I'd skip the timezones problem for now and focus on merging the core functionality first.
pal/src/host/linux-sgx/utils/fast_clock.c
line 226 at r1 (raw file):
Previously, jonathan-sha (Jonathan Shamir) wrote…
Not sure if this method is needed - can we rely on the RDTSC emulation feature?
Maybe we can do the following - have a global
g_is_rdtsc_emulated
which is set under theFIRST_TIME()
guard in the emulator. This method will simply callget_tsc()
then check the global. We can then completely remove this logic frompal_sgx_main
.
Yeah, sounds like a much better idea IMO.
197f2d4
to
f163c04
Compare
…lying on cpuid and steady clock frequency heuristics (fast-clock) Signed-off-by: Jonathan Shamir <[email protected]>
f163c04
to
664de3d
Compare
Yeah, I remember about it, but I was quite busy recently with other stuff. I have some more time now, so I'll continue the review now :) |
@jonathan-sha: Could you reply to all the discussions in Reviewable? There are a few of them from Dmitrii which you left unanswered and it's not obvious whether you fixed the thing pointed out there or not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 12 files reviewed, 29 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @mkow)
Previously, mkow (Michał Kowalczyk) wrote…
I don't think there's a point in splitting this change into 4 commits, as far as I understand they can't/shouldn't be used separately, so they aren't really independent of each other.
I squashed all commits
pal/src/host/linux-sgx/enclave_ocalls.h
line 92 at r1 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
+1, I'd skip the timezones problem for now and focus on merging the core functionality first.
timezone handling was removed
pal/src/host/linux-sgx/enclave_ocalls.c
line 1825 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I kept
ocall_gettime
for compatibility with the rest of the code.No reason to keep compatibility. Just change the code. (These are internal Gramine functions, so can be changed without caring about compatibility issues.)
Done.
pal/src/host/linux-sgx/host_ocalls.c
line 608 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We have this func in
cpu.h
:gramine/common/include/arch/x86_64/cpu.h
Line 66 in fcdd1d5
static inline uint64_t get_tsc(void) { The
cpu.h
header is independent from enclave/non-enclave environment. So yes, just use it.
Done.
pal/src/host/linux-sgx/pal_main.c
line 561 at r1 (raw file):
Previously, jonathan-sha (Jonathan Shamir) wrote…
I need to rework the initialization a bit. I'll re-add this print, but I'm not sure we need it to happen in the main function.
I added print_warnings_on_disabled_clock_emulation above
pal/src/host/linux-sgx/pal_misc.c
line 319 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
These 2 leaves won't be needed now, right? We can remove them.
Done.
pal/src/host/linux-sgx/utils/fast_clock.h
line 2 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We use
#pragma once
Done.
pal/src/host/linux-sgx/utils/fast_clock.h
line 9 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
There is nothing C++ in our codebase, please remove these guards.
Done.
pal/src/host/linux-sgx/utils/fast_clock.h
line 13 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We have a different coding style (based on Google C/C++ coding style), see https://gramine.readthedocs.io/en/stable/devel/coding-style.html
Done.
pal/src/host/linux-sgx/utils/fast_clock.h
line 19 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I don't like the
FC_
, it's not immediately obvious what it is.Hm, is this enum really needed to be in the header? It looks like it can be embedded in the source file itself. Then it won't be visible in other compilation units, and I'd be fine with
FC_
since it will become local-to-single-file.
Done.
pal/src/host/linux-sgx/utils/fast_clock.h
line 39 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I see. You can change to a bitfield here too.
Also, please add a
static_assert(sizeof (fast_clock_desc_t) == sizeof(uint64_t))
to make it explicit. Plus add a comment that this struct must be accessed with atomic ops.
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 1 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You should use
"api.h"
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 10 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please reformat to 100-chars-per-line.
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 28 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
It would be nicer if you would ASCII-draw the state machine with transitions.
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 81 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We have some macro for this in
api.h
, please check (I don't remember the name)
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 84 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
How did you decide on these values? Please add a comment.
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 139 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Need to analyze security and validate received-from-host values
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 148 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What's the point in this helper function? It brings no benefit.
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 155 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What's the point in this helper function? It brings no benefit.
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 164 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This is ugly, why can't you define functions in the order of dependency?
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 166 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We don't use the
__
style. Just one_
is enough.
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 226 at r1 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Yeah, sounds like a much better idea IMO.
I removed the previous implementation of this - is_rdtsc_available
used to check using cpuid if rdtsc is legal in an sgx enclave on this machine. Since gramine has the tsc emulation feature, what I currently do is check if the emulation was triggered.
In linux_pal_main we still have the get_tsc()
early during start up. If rdtsc isn't supported, this will trigger a fast_clock_disable
on the global g_fast_clock
. This method is kept so we can transition to DISABLED in case we already had a gettimeofday
in flight (during linux_pal_main for some reason) or if the non-global g_fast_clock is used (unit tests etc).
On second thought - another implementation could be to start off as DISABLED, and implement fast_clock_enable
method. This could be done in the pal_main after triggering get_tsc()
, so we separate responsibilities. fast-clock will assume that rdtsc is allowed and pal_main will enable fast-clock only after checking that rdtsc isn't emulated.
pal/src/host/linux-sgx/utils/fast_clock.c
line 290 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Need to analyze security and validate received-from-host values
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 350 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Need to analyze security and validate received-from-host values
Done.
pal/src/host/linux-sgx/utils/fast_clock.c
line 362 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add newlines at the end of each file.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 11 files at r1, 3 of 11 files at r3, all commit messages.
Reviewable status: 4 of 12 files reviewed, 30 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @jonathan-sha)
-- commits
line 2 at r3:
This one-liner is much too long, see https://commit.style/ (no need to strictly follow the 50-char limit, but 134 characters is much too much :) Please move the details to the commit message body.
pal/src/host/linux-sgx/enclave_ocalls.c
line 1769 at r3 (raw file):
} int ocall_gettime(uint64_t* microsec_ptr, uint64_t* tsc_ptr) {
Sorry, old code, we must have missed this when unifying naming conventions.
Suggestion:
uint64_t* out_microsec, uint64_t* out_tsc
pal/src/host/linux-sgx/enclave_ocalls.c
line 1814 at r3 (raw file):
} /* detect malicious host - time and tsc must monotonically increase */
Couldn't we just use tsc_before_ocall
for new_tsc
? The OCALL delay would cancel out, if it's roughtly constant (but and don't know if it is). This would considerably simplify the logic here.
pal/src/host/linux-sgx/meson.build
line 89 at r3 (raw file):
'pal_streams.c', 'pal_threading.c', 'utils/fast_clock.c',
I don't think we need a directory (especially with such a vague name), there would only be a single file there.
Suggestion:
'pal_rdtsc_clock.c',
Description of the changes
This is a re-implementation of _palSystemTimeQuery in a way that doesn't require steady TSC feature and works on previously unsupported hypervisors.
See full explanation and discussion in #1799
Fixes #1799
How to test this PR?
This change is