Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[24.1] Fix quota usage with user object stores #19323

Conversation

davelopez
Copy link
Contributor

Attempt to fix #19320

How to test the changes?

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@davelopez
Copy link
Contributor Author

I'm not sure if 9c29c90 is the proper fix, probably not. It does fix the issue if I try to recalculate my quota in the UI, but it breaks other quota usage unit tests in a way I don't understand...

I'll keep investigating, but any hint or confirmation would be much appreciated 😃

@davelopez davelopez requested a review from jmchilton December 13, 2024 14:17
@davelopez
Copy link
Contributor Author

The resulting SQL query is something like this:

UPDATE galaxy_user SET disk_usage = (
WITH per_user_histories AS
(
    SELECT id
    FROM history
    WHERE user_id = :id
        AND NOT purged
),
per_hist_hdas AS (
    SELECT DISTINCT dataset_id
    FROM history_dataset_association
    WHERE NOT purged
        AND history_id IN (SELECT id FROM per_user_histories)
)
SELECT COALESCE(SUM(COALESCE(dataset.total_size, dataset.file_size, 0)), 0)
FROM dataset
LEFT OUTER JOIN library_dataset_dataset_association ON dataset.id = library_dataset_dataset_association.dataset_id
WHERE dataset.id IN (SELECT dataset_id FROM per_hist_hdas)
    AND library_dataset_dataset_association.id IS NULL
    AND dataset.object_store_id NOT LIKE 'user_objects://%'
      
)
WHERE id = :id

It works both in Galaxy and directly against the database, but it always returns 0 in the unit tests 🤔

@jmchilton
Copy link
Member

I would think the "correct solution" would be to gather all the object_store_ids for the user and build a virtual quota source with quota turned off and assign those all to that quota source and this would all just happen - but I think it might be less performant than this approach. This seems like a good shortcut though with better performance. There are probably some other queries that need to be updated - I think there is one that adjusts the quota for a new dataset and one that redoes all of the quota for a given user and they will need to be in sync I think.

@davelopez davelopez force-pushed the 24.1_fix_quota_usage_with_user_object_stores branch from 8bef42f to 9debd5d Compare December 16, 2024 16:38
@davelopez
Copy link
Contributor Author

I think I've fixed the condition and now the quota unit tests are passing with d408655

I also fixed the updated quota on dataset upload with this condition. Thanks for the hint on that @jmchilton!

The other 2 queries, I think are covered by the same change in UNIQUE_DATASET_USER_USAGE as this is reused in:

def calculate_user_disk_usage_statements(user_id, quota_source_map, for_sqlite=False):

and

def calculate_disk_usage_default_source(self, object_store):

Hopefully, all should be covered now so I'll take it out of draft. Do you think there might be some other missing case?

@davelopez davelopez marked this pull request as ready for review December 17, 2024 12:31
@github-actions github-actions bot added this to the 24.1 milestone Dec 17, 2024
@davelopez
Copy link
Contributor Author

Test failures seem unrelated. It should be ready to go.

@davelopez davelopez requested a review from a team December 18, 2024 10:56
@bgruening
Copy link
Member

@jmchilton does this look good to you?

Uploading a dataset to a private object store will not be counted
against the default quota source now.
To avoid confusion with "private" as in "not shareable" object stores.
And to make it clear that these are user-defined object stores.
This fixes the unit tests because when the dataset.object_store_id was null, the condition was also evaluated to null.
@davelopez davelopez force-pushed the 24.1_fix_quota_usage_with_user_object_stores branch from d408655 to c42245c Compare January 13, 2025 12:25
Copy link
Member

@jdavcs jdavcs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@davelopez
Copy link
Contributor Author

Thanks for the review! The test failures are unrelated, I will merge now.

@davelopez davelopez merged commit 2e91660 into galaxyproject:release_24.1 Jan 23, 2025
45 of 51 checks passed
@davelopez davelopez deleted the 24.1_fix_quota_usage_with_user_object_stores branch January 23, 2025 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants