Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix segfault for decoupled models #327

Merged
merged 4 commits into from
Dec 15, 2023
Merged

Fix segfault for decoupled models #327

merged 4 commits into from
Dec 15, 2023

Conversation

krishung5
Copy link
Contributor

@krishung5 krishung5 commented Dec 12, 2023

This PR fixed the segfault issues:

  • when an exception is raised within a decoupled model. We expected users to handle error by returning an error response with a final flag like this https://github.com/triton-inference-server/server/blob/main/qa/L0_backend_python/decoupled/models/decoupled_execute_error/1/model.py#L95-L106. This fix should also cover the use case where an exception is raised by calling raise Exception("Intentional Error") without sending any final flag via response sender.
  • when there are lots of concurrent requests to decoupled models, the request object may have the same address which causes issues when using that address as id to retrieve the response factory from the map. This PR removed the usage of response_factory_map. The backend will create a new response factory for each request, no matter if the request is a rescheduled one or not, and clean up the response_factory when response_sender goes out of scope in the Python model to avoid memory growth.

Added testing: triton-inference-server/server#6686

src/python_be.cc Outdated Show resolved Hide resolved
@krishung5 krishung5 requested a review from rmccorm4 December 12, 2023 02:36
src/python_be.cc Outdated Show resolved Hide resolved
@krishung5 krishung5 requested a review from nnshah1 December 12, 2023 22:28
Copy link
Member

@Tabrizian Tabrizian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it going to be cherry-picked for 23.12?

@krishung5
Copy link
Contributor Author

krishung5 commented Dec 13, 2023

Is it going to be cherry-picked for 23.12?

This is to fix a segfault so I think we could if we have not staged the container. @tanmayv25 May I have your thoughts on this?

@krishung5 krishung5 changed the title Set release flags and clean up response factory map before returning … Fix segfault for decoupled models Dec 14, 2023
@krishung5 krishung5 requested a review from Tabrizian December 14, 2023 22:57
@krishung5 krishung5 merged commit c5f304d into main Dec 15, 2023
3 checks passed
krishung5 added a commit that referenced this pull request Dec 15, 2023
* Set release flags and clean up response factory map before returning error

* Address comments

* Move the cleanup function to the outside scope

* Delete response factory when response sender goes out of scope
krishung5 added a commit that referenced this pull request Dec 15, 2023
* Set release flags and clean up response factory map before returning error

* Address comments

* Move the cleanup function to the outside scope

* Delete response factory when response sender goes out of scope
mc-nv pushed a commit that referenced this pull request Dec 20, 2023
* Set release flags and clean up response factory map before returning error

* Address comments

* Move the cleanup function to the outside scope

* Delete response factory when response sender goes out of scope
@krishung5 krishung5 deleted the krish-decoupled branch May 30, 2024 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants