Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

option to avoid compiling a method that's known to throw for any input #3

Open
nsajko opened this issue Jan 2, 2025 · 11 comments
Open

Comments

@nsajko
Copy link

nsajko commented Jan 2, 2025

If a method is known to throw for any input, I think it would be good to avoid compiling it by default, or at least provide such an option.

Here's code that approximates the desired predicate of "known to throw":

function does_not_return(f, arg_types)
    Base.infer_return_type(f, arg_types) <: Union{}  # not public, sadly
end

function does_not_throw(f, arg_types)
    Base.infer_exception_type(f, arg_types) <: Union{}  # not public, sadly
end

function known_to_throw_unless_it_runs_forever(f, arg_types)
    may_throw = !does_not_throw(f, arg_types)
    may_throw && does_not_return(f, arg_types)
end
@nsajko
Copy link
Author

nsajko commented Jan 2, 2025

An issue is that the functions Base.infer_return_type and Base.infer_exception_type aren't public. The bright side is that, for Julia v1.12 and up, the compiler is a regular package, Compiler, so it will be possible to support this without depending on internals.

@nsajko
Copy link
Author

nsajko commented Jan 2, 2025

For Julia in general, this should work:

function known_not_to_return(f, arg_types)
    if (
        isdefined(Base, :infer_return_type) &&
        applicable(Base.infer_return_type, arg_types)
    )
        Base.infer_return_type(f, arg_types) <: Union{}  # not public, sadly
    else
        false
    end
end
function known_not_to_throw(f, arg_types)
    if (
        isdefined(Base, :infer_exception_type) &&
        applicable(Base.infer_exception_type, arg_types)
    )
        Base.infer_exception_type(f, arg_types) <: Union{}  # not public, sadly
    else
        false
    end 
end
function known_to_throw_unless_it_runs_forever(f, arg_types)
    may_throw = !known_not_to_throw(f, arg_types)
    may_throw && known_not_to_return(f, arg_types)
end

@jakobjpeters
Copy link
Owner

jakobjpeters commented Jan 2, 2025

That's a really good idea! Here's a quick test.

v0.1.1

julia> speculate(all_modules; verbosity = review)
┌ Info: Generated `6358` methods from `26794` generic methods in `124.9810` seconds
│ Compiled `2622`
│ Skipped  `3736`
└ Warned   `0`

Using does_not_return

This skips 51 specializations, and doesn't noticeably increase the search-time upon a repeated call.

julia> speculate(all_modules; verbosity = review)
┌ Info: Generated `6358` methods from `26794` generic methods in `124.2046` seconds
│ Compiled `2571`
│ Skipped  `3787`
└ Warned   `0`

Using does_not_throw

This was initially surprising to me; am I understanding correctly that it's skipping methods that may throw but not necessarily? It also doubles the runtime upon a repeated call.

julia> speculate(all_modules; verbosity = review)
┌ Info: Generated `6358` methods from `26794` generic methods in `43.7824` seconds
│ Compiled `827`
│ Skipped  `5531`
└ Warned   `0`

Using known_to_throw_unless_it_runs_forever

Seems like does_not_return is a subset of does_not_throw.

julia> speculate(all_modules; verbosity = review)
┌ Info: Generated `6358` methods from `26794` generic methods in `44.4571` seconds
│ Compiled `827`
│ Skipped  `5531`
└ Warned   `0`

I'm currently leaning towards including the check for does_not_return, but leaving out the check for does_not_throw. I had initially experimented with using search flags such as union_types, method_types, abstract_subtypes, etc. However, it felt too complicated and required users to know about internals in some cases. The predicate is a much more friendly API in my opinion, even though it is limited to filtering module names and (in v0.2.0) parameter types.

A bit off-topic regarding internals, this package relies on them heavily anyways. It would be nice for me to go through and try to see if any of the current internal usage can be replaced by a dependency on Compiler to try to minimize the surface area, so thank you for the suggestion! Here's a rough sketch of places it currently uses internals:

  • Detecting and skipping built-in functions
  • Checking method.nospecialize to determine whether or not to search for subtypes
  • Accessing the method.sig.types
  • Using Base.specializations to skip methods that have already been specialized
  • Skipping names that satisfy Base.isdeprecated
  • Checking @ccall jl_generating_output()::Cint
  • Pushing a hook into Base.active_repl_backend.ast_transformations
  • Searching all_modules using Base.loaded_modules_array()
  • ... Several more ...

@nsajko
Copy link
Author

nsajko commented Jan 2, 2025

am I understanding correctly that it's skipping methods that may throw but not necessarily?

The does_not_throw predicate (renamed to known_not_to_throw in my later comment) tells whether Julia knows for certain that a method will never throw.

I'm currently leaning towards including the check for does_not_return, but leaving out the check for does_not_throw.

On its own, does_not_return tells if a method is known never to return. This can happen in multiple cases, including two basic cases:

  • never returns implies either:
    • basic case: runs forever
    • basic case: always throws
    • mixes of these basic cases

The always throws case is the one we want to find to avoid compiling the method. The runs forever case is rare in practice, but when it happens it's probably desirable to compile (it could be the main loop of an application). The rationale behind using does_not_throw is to separate some of the runs forever cases from the rest of never returns, to allow them to be compiled.

There's probably a different approach that would allow greater accuracy, but this is what I came up with. In particular, it might make sense to look into what JET is doing.

@jakobjpeters
Copy link
Owner

Oh, I misunderstood the intent and didn't correctly implement the combined approach. Thank you for explaining it further. I'll write some tests and try it out soon :)

@jakobjpeters
Copy link
Owner

Does this reflect your intent correctly?

function count_specializations(f)
    ...
    speculate(f; ...)
    ...
end

f() = error()
g() = while true end
function h()
    while true
        rand(Bool) && error()
    end
end

@test 0 == count_specializations(f)
@test 1 == count_specializations(g)
@test 1 == count_specializations(h)

@nsajko
Copy link
Author

nsajko commented Jan 3, 2025

f always throws so by this proposal the test above is OK, we don't want it compiled.

g never returns but also never throws, running forever, so I think it's good to have it compiled, in principle. That said, if the cost of calling Base.infer_exception_type is too high, perhaps it may make sense to avoid compiling g by default, too. So I guess there's a trade off here, and one would have to do some experimentation on a real code base to be able to tell which approach is better.

h never returns, belonging to a mixed case I referred to above. In principle I think h is representative of a possible realistic event loop design (in practice rand(Bool) can't keep returning the same value forever, so h is guaranteed to throw eventually, but I don't think abstract interpretation knows about this), so I'd say h would ideally be compiled by default. That said, the predicate I propose sadly isn't granular enough here.

NB: I'm sure it's possible to implement a more accurate predicate, that would return true only if a method is known to throw, however I don't know how to implement this yet.

@jakobjpeters
Copy link
Owner

That said, if the cost of calling Base.infer_exception_type is too high, perhaps it may make sense to avoid compiling g by default, too

I think that checking known_not_to_return prior to known_not_to_throw alleviates the performance cost of the latter

@jakobjpeters
Copy link
Owner

jakobjpeters commented Jan 3, 2025

one would have to do some experimentation on a real code base to be able to tell which approach is better.

It seems unrealistic that (m)any current codebase(s) exceeds runtime of speculate(all_modules; dry = true), which only takes a few seconds. It is possible that there are different performance characteristics when considering limit > 1, which could run nearly arbitrarily long. The point being that testing with all_modules seems like a reasonable place to start.

@jakobjpeters
Copy link
Owner

Do you know an example where JET.jl is able to distinguish between always throw and may loop forever or throw?

julia> @report_call f()
═════ 1 possible error found ═════
┌ f() @ Main ./REPL[28]:1
│┌ error() @ Base ./error.jl:42
││ may throw: Base.throw(Base.ErrorException((Base).string::typeof(string)(s::Tuple{}...)::String)::ErrorException)
│└────────────────────

julia> @report_call g()
No errors detected

julia> @report_call h()
═════ 1 possible error found ═════
┌ h() @ Main ./REPL[30]:3
│┌ error() @ Base ./error.jl:42
││ may throw: Base.throw(Base.ErrorException((Base).string::typeof(string)(s::Tuple{}...)::String)::ErrorException)

@nsajko
Copy link
Author

nsajko commented Jan 4, 2025

Seems to be such an example:

function f(in::IO, out::IO)
    line = readline(in)
    n = parse(Int, line)
    show(out, n)
    print(out, '\n')
end

function g(in::IO, out::IO)
    while true
        f(in, out)
    end
end

function g()
    g(stdin, stdout)
end
julia> using JET

julia> @report_call g()
No errors detected

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants