Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite timeout tracking to be in cpu-time instead of wall-time #9

Merged
merged 8 commits into from
Oct 8, 2024

Conversation

Fayti1703
Copy link
Member

@Fayti1703 Fayti1703 commented Oct 4, 2024

Problem

Script time execution tracking is currently done with wall-clock-time, not cputime. This means differences in the OS scheduler affect script time execution.

Context

This initial implementation is not cross-platform (clock_gettime is a POSIX API). Additionally, this implementation does not yet use the pthread_getcpuclockid function discovered after the implementation work was already complete.

The failing timeout test is due the timeout being induced by sleep 0.1, which does not take up any CPU time.

This implementation uses pthread_getcpuclockid to obtain the CPU clock for the thread running V8 code, then continually polls that clock to check for the execution timeout.

lib/mini_racer.rb Outdated Show resolved Hide resolved
@seanmakesgames
Copy link
Member

testing webhook

lib/mini_racer.rb Outdated Show resolved Hide resolved
Copy link
Member

@seanmakesgames seanmakesgames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking awesome so far! Just need to figure out how to get it to compile on my machine.

lib/mini_racer.rb Outdated Show resolved Hide resolved
@seanmakesgames
Copy link
Member

More more bad news, haha: These versions of macos that the tests are running on are likely no longer supported by github, which is why they aren't getting picked up.

@Fayti1703 Fayti1703 force-pushed the fayti1703/timeout-cputime branch from 998b305 to 15d4eab Compare October 8, 2024 17:50
@Fayti1703 Fayti1703 marked this pull request as ready for review October 8, 2024 18:04
@seanmakesgames
Copy link
Member

validated things are working in hm with this change. (still using timeout, not cputime_limit)

lib/mini_racer.rb Outdated Show resolved Hide resolved
@seanmakesgames seanmakesgames merged commit 63da862 into main Oct 8, 2024
4 checks passed
@seanmakesgames seanmakesgames deleted the fayti1703/timeout-cputime branch October 8, 2024 19:07
@20k
Copy link

20k commented Oct 8, 2024

Just as some notes because I checked out this PR and noticed it was C++, std::atomic<unsigned long> is possibly the wrong type on windows (its 32bit) for mini_racer - though I'm not sure if windows is a platform that's in scope, but you may want std::atomic_uint64_t either way

C++17 atomics are also uninitialised in a very weird way (you can only destruct them, or manually init them), so using atomic operations on it is UB. A default initialised std::atomic must be finished up with std::atomic_init, and then be written to atomically (as the initialisation via std::atomic_init is not atomic. Woo!) 🎈 🍰 . See here or here for details if you're curious

In C++20, it'll be non atomically 0 inited by default

In both C++17, and C++20, using a relaxed memory store to initialise the std::atomic ends up leading to UB later, as relaxed consistency does not enforce ordering constraints, so other threads can race-ily observe the non atomic 0 init (assuming the C++17 version is fixed, because atm its straight UB). The initialisation of the atomic via the evalparams struct, and the initialisation via a write of the atomic, are in different threads via a relaxed write, which is what causes problems as there's no ordering guarantees for any later reads

If nogvl_context_eval is always called before v8_update_cpu_time and rb_context_update_cpu_time, I believe you could use a sequential consistency store. Or you could initialise the atomic directly before initialising the thread, and avoid this problem entirely

As a side note: relaxed consistency is probably not what you want in general here either, as it might result in the timer appearing to fail to update intermittently, and would recommend sequential consistency (as acquire/release isn't worth the mental burden). Its less important on x86 than arm in this context, but I think apple's x86 emulation doesn't emulate the memory ordering by default, so its more likely to cause weird issues there

@Fayti1703
Copy link
Member Author

Fayti1703 commented Oct 9, 2024

Thanks for the detailed notes; it's been a while since I've done C++ and it looks like I ran right into some traps 😅

I do want to mention that the atomic variable is currently very dead code anyway; the current implementation using a pthread_getcpuclockid doesn't touch it at all outside of (failing to) init it and storing 0L. I'll open a PR to remove it entirely.

Just as some notes because I checked out this PR and noticed it was C++, std::atomic is possibly the wrong type on windows (its 32bit) for mini_racer - though I'm not sure if windows is a platform that's in scope, but you may want std::atomic_uint64_t either way

Windows is very much not in scope, since it doesn't support the pthread APIs in the first place (and upstream MR also doesn't support it).

C++17 atomics are also uninitialised in a very weird way (you can only destruct them, or manually init them), so using atomic operations on it is UB. A default initialised std::atomic must be finished up with std::atomic_init, and then be written to atomically (as the initialisation via std::atomic_init is not atomic. Woo!)

That is... extremely wierd. You'd think the point of a constructor is to, uh, initialize the type so it can be used.

In both C++17, and C++20, using a relaxed memory store to initialise the std::atomic ends up leading to UB later, as relaxed consistency does not enforce ordering constraints, so other threads can race-ily observe the non atomic 0 init (assuming the C++17 version is fixed, because atm its straight UB).

The initialisation of the atomic via the evalparams struct, and the initialisation via a write of the atomic, are in different threads via a relaxed write, which is what causes problems as there's no ordering guarantees for any later reads

The atomic variable is actually only ever written from one thread, since code flow starting from rb_context_eval_unsafe never moves threads (it can't, the thread's locking an v8::Isolate; and rb_thread_call_without_gvl does not cause a thread change, only a GVL-unlock). In the previous implementation, there was also a call to v8::Isolate::RequestInterrupt(InterruptCallback, void*), and the callback function for passed there jumps threads to the same thread that V8's currently executing on.

But yes, the read being uninitialized is a good point. I managed to completely miss that while thinking through the semantics of this (which are already non-trivial, since there's an implied happens-before with v8::Isolate::RequestInterrupt's callback function being called to script execution starting)

If nogvl_context_eval is always called before v8_update_cpu_time and rb_context_update_cpu_time, I believe you could use a sequential consistency store. Or you could initialise the atomic directly before initialising the thread, and avoid this problem entirely

Funnily enough, it might actually not be! There is a happens-before relationship here due to linear code flow, but it's "watchdog thread spawned" (which used to call rb_context_update_cpu_time after 100ms) happens-before "nogvl_context_eval is called". Whoops!

As a side note: relaxed consistency is probably not what you want in general here either, as it might result in the timer appearing to fail to update intermittently

Due to how v8::Isolate::RequestInterrupt works (which was used to update the timer before), there was always a possibility that the timer would not be updated intermittently, so I didn't see that as an issue.

and would recommend sequential consistency (as acquire/release isn't worth the mental burden).

Noted for future reference, thanks.

@seanmakesgames
Copy link
Member

@20k Thanks for the 👀 and your thoughts!

@ast-ral
Copy link

ast-ral commented Oct 10, 2024

Quick drive-by comment: I actually wouldn't recommend going with sequential consistency unless you specifically know that you need the properties it gives. Acquire/release semantics are what you want in the majority of cases.

@ast-ral
Copy link

ast-ral commented Oct 10, 2024

Perhaps put more eloquently:

It's a commonly known fact that for any piece of code that is correct, you can change all the atomic operations to have sequentially consistent ordering and it will still be correct. This fact often leads to people advocating for using nothing but sequential consistency. However, no matter what atomic ordering you use, you have to do quite a bit of thinking to make sure your code is actually definitely 100% correct. In doing so, you'll often find that acquire/release or even relaxed semantics are the correct choice and that the additional guarantees of sequential consistency are unnecessary. This is why actually seeing a sequentially consistent operation in code is an immediate red flag to me: it means that the author either didn't actually think about their code, and assumed that sequential consistency will just "make it work" or that the code relies on fairly esoteric/uncommon guarantees regarding the existence of a globally consistent modification order on atomic operations. In either case, it's an indication that you should read very closely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants