Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvement for step execution retrieval #4208

Closed
wants to merge 1 commit into from
Closed

Performance improvement for step execution retrieval #4208

wants to merge 1 commit into from

Conversation

hpoettker
Copy link
Contributor

This PR is for versions 4.3.x and addresses #3790 and #4135.

Using JobExplorer::getStepExecution for multiple step executions is not ideal. Each invocation will retrieve the corresponding job execution which in turn triggers the retrieval of all step executions of the job execution. If JobExplorer::getStepExecution is used for all step executions, this leads to efforts and memory consumption that scale quadratically in the number of step executions.

The idea of the PR is to only call JobExplorer::getJobExecution and get the step executions from the returned job execution. The PR is intended to be conservative in its changes due to the late stage of the 4.3.x releases. That's why the filtering on the step execution ids are implemented although I'm not 100% sure that they are actually needed in every case.

I tested the change locally with the reproducing project provided here: #3790 (comment) With 250 MB of heap, the application no longer crashes and the memory profile looks much better.

For Spring Batch 5, it makes sense to forward port the change, I think. It should also be considered to deprecate JobExplorer::getStepExecution and provide a way to retrieve a StepExecution from a JobExecution by id instead.

@vitenkov
Copy link

do you guys have any expectation when the fix be pushed to spring-batch?

@fmbenhassine
Copy link
Contributor

The idea of this PR is a clever trick and a nice middle ground between having the performance issue and fixing it with a single query for polling + shallow copies of step executions as mentioned in #3790 (comment).

I appreciate the fact that you tried to fix the problem with 4.3.x in mind! This PR has been rebased and merged in v4.3.x and forward ported to v5 (without the JSR related changes, see 93800c6). Thank you for your contribution!

@hpoettker hpoettker deleted the get-step-executions-once branch February 22, 2023 11:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants