[24.1] Determine expression tool output extension when input terminal #19364

Draft
mvdbeek wants to merge 2 commits into release_24.1 from determine_expression_tool_output_extension_when_input_terminal

mvdbeek
Member

@mvdbeek mvdbeek commented Dec 20, 2024

Fixes #19245 by waiting for the dataset to become terminal if it's an expression.json dataset that we infer the output datatype from.
A slightly better implementation could look at the tool outputs when we generate the command line, which is when we're going to submit and have the terminal datasets ... ?
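
A minimal sketch of the waiting behaviour described above. Everything here is an assumption made for illustration: the `Dataset` class, the `TERMINAL_STATES` set, the `determine_output_extension` function, and the local stand-in for `ToolInputsNotReadyException` are hypothetical and do not mirror Galaxy's actual models or state names.

```python
from dataclasses import dataclass

# Hypothetical set of terminal states; the real list lives in Galaxy's model.
TERMINAL_STATES = {"ok", "error", "discarded"}


class ToolInputsNotReadyException(Exception):
    """Local stand-in for the Galaxy exception named in this PR."""


@dataclass
class Dataset:
    extension: str
    state: str


def determine_output_extension(input_dataset: Dataset) -> str:
    """Return the extension an output would inherit from its input dataset."""
    not_terminal = input_dataset.state not in TERMINAL_STATES
    if input_dataset.extension == "expression.json" and not_terminal:
        # The datatype cannot be inferred reliably yet, so defer the job
        # instead of guessing.
        raise ToolInputsNotReadyException("input dataset is not terminal yet")
    return input_dataset.extension
```

The caller would catch the exception and retry later, which is why any session changes staged before the exception matter.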

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@mvdbeek mvdbeek force-pushed the determine_expression_tool_output_extension_when_input_terminal branch from eeacd6c to 5066bc2 on December 20, 2024 21:44
@mvdbeek
Member Author

mvdbeek commented Dec 20, 2024

We might be adding objects to the sqlalchemy session (and maybe even committing them?) before hitting the ToolInputsNotReadyException. This would be a good place to use session.begin_nested() and session.rollback() to undo staged changes, but I couldn't get that to work without changing the entire commit strategy.

@jdavcs
Member

jdavcs commented Jan 9, 2025

We might be adding objects to the sqlalchemy session (and maybe even committing them?) before hitting the ToolInputsNotReadyException. This would be a good place to use session.begin_nested() and session.rollback() to undo staged changes, but I couldn't get that to work without changing the entire commit strategy.

So, if we hit this exception, to what savepoint should we rollback? i.e., where would the nested transaction start? (The method is called from determine_output_format which is called from handle_output). I'm not sure what session state we'd want to preserve and what to discard.

@mvdbeek
Member Author

mvdbeek commented Jan 10, 2025

So, if we hit this exception, to what savepoint should we rollback?

To where the try scope starts,

rval = self._execute(

@jdavcs
Member

jdavcs commented Jan 10, 2025

Seems not impossible, but potentially tricky. So, the idea is this (restating the obvious here..):

# session at state 1
inner_trans = session.begin_nested()  # emits a SAVEPOINT
try:
    ...  # do stuff
except ToolInputsNotReadyException:
    inner_trans.rollback()  # roll back to state 1
# continue using session

The tricky part: the "do stuff" is large and includes at least a few commits (I haven't traced everything though). A commit will break inner_trans.rollback() since there's no open transaction left to roll back. Replacing all session.commit() calls with inner_trans.commit() doesn't work either: after one such call the SAVEPOINT is released and the inner transaction doesn't exist anymore, so no rollback is possible.

However, if we flush instead of committing inside the try/except scope, then it should work: flushing does not end the transaction, so we can still roll back the inner transaction. HOWEVER, this raises the bigger question: is it safe to flush instead of committing? We replaced flushes with commits as part of migrating to SA 2.0 (SA <1.4 committed implicitly on flush with our auto, ref #15421) to ensure we were not modifying behavior. Now that we are safely on 2.0, it may be time to reevaluate all those commits and replace them with flushes where possible. (As a reminder, a flush sends pending changes to the database inside the current transaction; they are discarded on rollback and persisted on commit.) This doesn't have to be all or nothing: we can start with this use case, i.e., wrap this scope in a nested transaction and replace all commits within it with flushes.

Another potentially tricky part is making this work with sqlite. As per SA's docs, workarounds are needed to make SAVEPOINT function correctly. How much the proposed workarounds would affect us, I don't know yet.
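
The commit-versus-flush point above can be illustrated at the SQL level. This is only a sketch using Python's standard `sqlite3` module rather than SQLAlchemy: the `job` table and the `run` function are hypothetical, and an uncommitted INSERT stands in for a flush. The connection is opened with `isolation_level=None` so that BEGIN, SAVEPOINT and COMMIT are issued explicitly instead of being managed by the driver.

```python
import sqlite3


def run(commit_inside: bool) -> list:
    """Write one row before a savepoint and one after, then try to roll back."""
    # isolation_level=None: the driver does not open or close transactions
    # on its own, so every BEGIN/COMMIT below is explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE job (name TEXT)")
    conn.execute("BEGIN")
    conn.execute("INSERT INTO job VALUES ('state 1')")
    conn.execute("SAVEPOINT inner_trans")
    # Sent to the database but not committed, comparable to a flush.
    conn.execute("INSERT INTO job VALUES ('do stuff')")
    if commit_inside:
        # Ends the transaction, which also releases the savepoint.
        conn.execute("COMMIT")
    try:
        conn.execute("ROLLBACK TO inner_trans")
    except sqlite3.Error:
        pass  # the savepoint is gone, so there is nothing to roll back to
    rows = [row[0] for row in conn.execute("SELECT name FROM job ORDER BY rowid")]
    conn.close()
    return rows
```

Without the inner commit, only the row written before the savepoint survives the rollback. With the inner commit, both rows are already persisted and the rollback attempt fails, which is the breakage described above.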

@mvdbeek
Member Author

mvdbeek commented Jan 10, 2025

Thanks for looking at this.

we can start with this use case

That sounds good to me; I don't think it makes sense to alter this on a global scale. There aren't many code paths where we would benefit from being able to roll back a transaction, but this is one.

I think I have a decent alternative workaround for this bug that we can apply to 24.1; if we want to go with the flush strategy, we can target dev.

@mvdbeek mvdbeek mentioned this pull request Jan 10, 2025
4 tasks
@jdavcs
Member

jdavcs commented Jan 10, 2025

...if we want to go with the flush strategy we can target dev.

Would you like me to give it a try?

@mvdbeek
Member Author

mvdbeek commented Jan 10, 2025

If you have the time I think that would be great.
