Assistant Archival Cleanup #1011

Open
wants to merge 12 commits into
base: main

Conversation

stephherbers
Contributor

Description

990
Follow-up logic changes for assistant version archiving.

Now, when an assistant that is a working version is archived, all of its versions are also archived and deleted from OpenAI.

There is also a UI change. When checking whether an assistant can be archived, if it is the working version, the check covers the dependencies of all of its versions, not just the working version, and displays everything that needs to be archived first. This way, a version won't be archived and deleted from OpenAI while something (an experiment or pipeline) still references it.
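
A minimal sketch of the check described above, written against plain in-memory objects rather than the project's Django models. The `Version` class, its fields, and the `blocking_references` and `can_archive` helpers are all hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    """Hypothetical stand-in for an assistant version (not the real model)."""

    id: int
    is_working_version: bool = False
    experiments: list = field(default_factory=list)     # unarchived experiments using this version
    pipeline_nodes: list = field(default_factory=list)  # pipelines with a node pointing at this version
    versions: list = field(default_factory=list)        # only populated on the working version


def blocking_references(assistant):
    """Return every reference that must be archived before `assistant` can be."""
    # For the working version, inspect it together with all of its versions.
    targets = [assistant] + (assistant.versions if assistant.is_working_version else [])
    blockers = []
    for target in targets:
        blockers += [("experiment", name, target.id) for name in target.experiments]
        blockers += [("pipeline", name, target.id) for name in target.pipeline_nodes]
    return blockers


def can_archive(assistant):
    """Archiving is allowed only when nothing references any inspected version."""
    return not blocking_references(assistant)
```

The list returned by `blocking_references` is what a UI could display as "archive these first"; the real implementation would build it from querysets instead.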

User Impact

Demo

Docs

@codecov-commenter

codecov-commenter commented Dec 19, 2024

❌ 1 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 854 | 1 | 853 | 0 |
View the top 1 failed tests by shortest run time
apps/assistants/tests/test_delete.py::TestAssistantArchival::test_archive_assistant_fails_with_working_related_pipeline
Stack Traces | 0.626s run time
.../assistants/tests/test_delete.py:148: in test_archive_assistant_fails_with_working_related_pipeline
    assert assistant.is_archived is False  # archiving failed
E   assert True is False
E    +  where True = <OpenAiAssistant: >.is_archived


version_query = None
if assistant.is_working_version:
    version_query = list(
        map(
Collaborator

I might be wrong, but I don't think we have to map the ids to strings?

return self.experiment_set.filter(is_archived=False)

- def get_related_pipeline_node_queryset(self):
+ def get_related_pipeline_node_queryset(self, query=None):
Collaborator

@SmittieC SmittieC Jan 6, 2025

Since the query param is a list of ids, I suggest that we name and type it accordingly:

def get_related_pipeline_node_queryset(self, assistant_ids: list):

We could even pass in a queryset that returns the ids. We did something similar here for example. This wouldn't require parsing the query results to a list, which is nice, but up to you on which approach you'd like to take.

Also, looks like this parameter will come in useful to optimize the archive method. So instead of iterating through all assistant versions (in the case where we're archiving the working version), we can simply fetch the version ids and call this method with it. Same with get_related_experiments_queryset.
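
A rough sketch of the "pass in the ids" idea, using plain dictionaries in place of pipeline nodes. The function name `related_pipeline_nodes` and the node shape are assumptions for illustration. The only detail taken from this PR is that the tests store `assistant_id` in the node params as a string.

```python
def related_pipeline_nodes(nodes, assistant_ids):
    """Return the assistant nodes whose params reference any of `assistant_ids`.

    `nodes` is a list of dicts shaped like {"type": ..., "params": {...}},
    a hypothetical stand-in for the real node queryset.
    """
    # The tests in this PR store the id in node params as a string,
    # so normalise the wanted ids to strings before comparing.
    wanted = {str(assistant_id) for assistant_id in assistant_ids}
    return [
        node
        for node in nodes
        if node["type"] == "AssistantNode"
        and node["params"].get("assistant_id") in wanted
    ]
```

With this shape, archiving the working version needs one call with all version ids instead of one call per version.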

Collaborator

@snopoke snopoke left a comment

After reviewing this and testing it locally I think we need to take a step back and look more holistically at how we trace references to objects and make that visible in the UI.

I've started a doc to make it easier to collaborate on the thoughts and ideas: https://docs.google.com/document/d/1Z09GNpO17izoVcRPGSmppNthFruyRJ0033T-RYvapoE/edit?tab=t.0

Comment on lines 95 to 96
experiment = ExperimentFactory(pipeline=pipeline)
experiment.assistant = v2_assistant
Collaborator

This isn't a valid state: an experiment should have either an assistant or a pipeline, but not both.
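
As an illustration only, the constraint could be expressed as a small guard like the one below. `validate_experiment_source` is a hypothetical helper, not something that exists in the codebase, and it assumes an experiment with neither an assistant nor a pipeline is still allowed.

```python
def validate_experiment_source(assistant=None, pipeline=None):
    """Reject the invalid state where both an assistant and a pipeline are set."""
    if assistant is not None and pipeline is not None:
        raise ValueError("An experiment cannot reference both an assistant and a pipeline")
```

A test fixture that sets both would fail this check, which is a hint that the fixture is exercising a state the application should never produce.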

assert assistant.is_archived is True # archiving successful

@patch("apps.assistants.sync.push_assistant_to_openai", Mock())
def test_archive_versioned_assistant_with_still_exisiting_experiment_and_pipeline(self):
Collaborator

This test seems to be the same as the previous one: the presence of the pipeline has no effect because it doesn't reference the assistant.

pipeline = PipelineFactory()
NodeFactory(type="AssistantNode", pipeline=pipeline, params={"assistant_id": str(assistant.id)})
exp = ExperimentFactory(pipeline=pipeline)
exp.assistant = assistant
Collaborator

you shouldn't set this property since you're testing the reference through the pipeline

Comment on lines 120 to 123
NodeFactory(type="AssistantNode", pipeline=v2_exp.pipeline, params={"assistant_id": str(v2_assistant.id)})
NodeFactory(type="AssistantNode", pipeline=v2_exp.pipeline, params={"assistant_id": str(v2_assistant.id)})
v2_exp.assistant = v2_assistant
v2_exp.save()
Collaborator

I don't follow what's happening here. Creating a new version should take care of creating versioned nodes as well.

or version.get_related_pipeline_node_queryset().exists()
):
return False
for version in self.versions.all():
Collaborator

rather than re-iterating over the queryset you could accumulate the IDs in the previous loop
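
One way the single-pass version might look, sketched with plain callables. `archive_with_versions`, `has_references`, and `archive_ids` are hypothetical stand-ins for the queryset loop, the reference check, and the bulk archive step.

```python
def archive_with_versions(versions, has_references, archive_ids):
    """Check every version once and collect ids along the way."""
    version_ids = []
    for version in versions:
        if has_references(version):
            return False  # something still points at this version; archive nothing
        version_ids.append(version["id"])
    # Single pass done: archive everything using the ids gathered above.
    archive_ids(version_ids)
    return True
```

Because the ids are gathered during the check, the archive step can work from the list and does not need to evaluate the versions queryset a second time.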

@stephherbers
Contributor Author

Noting that we are moving forward with the changes from the doc, so those commits will come in later and I will re-request reviews.

def test_archive_assistant_succeeds_with_released_experiment_experiment(self):
    experiment = ExperimentFactory()
    exp_v2 = experiment.create_new_version()
    experiment.is_default = True
Collaborator

@SmittieC SmittieC Jan 16, 2025

Did you mean to make exp_v2 the default here?

exp_v2.is_default_version = True

Edit: exp_v2 should be exp_v1

experiment.is_default = True
experiment.save()
assistant = OpenAiAssistantFactory()
exp_v2 = experiment.create_new_version()
Collaborator

Nit: Since this will be version 2 (exp_v2), the previous exp_v2 should be exp_v1


- def get_related_pipeline_node_queryset(self):
+ def get_related_pipeline_node_queryset(self, assistant_ids: list = None):
Collaborator

Should these check the experiments that the pipeline belongs to instead of the pipeline itself?

all_files = list(code_resource.files.all())
assert not any(f.external_id for f in all_files)
assert not any(f.external_source for f in all_files)
class TestAssistantArchival:
Collaborator

Nice tests. It would be great if we can add tests for pipelines using an assistant as well.

4 participants