Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose additional beaker caching backends #15349

Merged
merged 15 commits into from
Jun 21, 2023
Merged

Expose additional beaker caching backends #15349

merged 15 commits into from
Jun 21, 2023

Conversation

claudiofr
Copy link
Contributor

Closes #15216

Convert beaker mulled resolution, citations and biotools service caches to use the database rather than the file system.

Introduce new config parameters for the beaker cache database url, schema name and cache table name.

Introduce logic to default values of a parameter to values of another parameter. i.e. default cache database url to database_connection. Metadata for this was placed in code. Should probably be placed eventually in config file which would require a change to the config_schema.yml.
Introduced a new unit tests, test_mulled_resolution_cache_db.py, test_citations_db.py
Did not introduce a new test for the biotools_service beaker cache because I could not find a place in the code where it was used.

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@claudiofr
Copy link
Contributor Author

I just looked at why my new test, test_resolution_cache_db.py failed in my fork's workflow. Apparently, the latest Beaker package 1.12.0 was installed there, whereas I did my local development with 1.11.0. It works fine in 1.11.0 but when I upgraded to 1.12.0 it fails. I need to work out why this is so.

@dannon dannon modified the milestones: 23.0, 23.1 Jan 22, 2023
@claudiofr
Copy link
Contributor Author

I created a simple test case demonstrating the issue in Beaker 1.12.0 and working in 1.11.0 and opened an issue with it on the Beaker github page. In the meantime, can we revert the main code base back to Beaker 1.11.0 because who knows how long it will take them to fix this.

@mvdbeek
Copy link
Member

mvdbeek commented Jan 31, 2023

It looks like the database connection isn't released on app shutdown, is there a hook in beaker to do that ? I think that's why the integration tests are eventually failing.

@claudiofr
Copy link
Contributor Author

claudiofr commented Feb 1, 2023 via email

@claudiofr
Copy link
Contributor Author

claudiofr commented Feb 1, 2023 via email

@mvdbeek
Copy link
Member

mvdbeek commented Feb 2, 2023

I don't think that will work, the integration tests call the app.shutdown hook that releases all resources in the process of shutting down the instance and starting a new one (

self._app.shutdown()
). This all happens in the same process.

@claudiofr
Copy link
Contributor Author

I changed the config_schema.yml setting the default cache table names for the 3 beaker caches, mulled resolution, citations, biotools, to the same value, beaker_cache rather than having separate tables. Because beaker creates a separate sqlalchemy engine and associated connection pool for each distinct table this will now create a single connection pool for all 3 caches rather than 3 connection pools. This in turn cuts down on the total number of potentially active connections by about 2/3's. As a consequence the integration tests are no longer failing.
Some of the other testing workflows are failing but as far as I can tell it has nothing to do with my changes.
Also support for database caches was broken in beaker v 1.12.0. I created an issue for it in the beaker repository and they fixed it in v 1.12.1. I have specified v 1.11.0 in the various requirements.txt files to play it safe.

Comment on lines 41 to 46
"cache.type": getattr(config, "citation_cache_type", "ext:database"),
"cache.data_dir": getattr(config, "citation_cache_data_dir", None),
"cache.lock_dir": getattr(config, "citation_cache_lock_dir", None),
"cache.url": getattr(config, "citation_cache_url", None),
"cache.table_name": getattr(config, "citation_cache_table_name", None),
"cache.schema_name": getattr(config, "citation_cache_schema_name", None),
Copy link
Member

@mvdbeek mvdbeek Feb 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should still default to file, but I also think the getattr isn't necessary at all and predates a refactoring of the config instance. All config attributes are always set now, so

Suggested change
"cache.type": getattr(config, "citation_cache_type", "ext:database"),
"cache.data_dir": getattr(config, "citation_cache_data_dir", None),
"cache.lock_dir": getattr(config, "citation_cache_lock_dir", None),
"cache.url": getattr(config, "citation_cache_url", None),
"cache.table_name": getattr(config, "citation_cache_table_name", None),
"cache.schema_name": getattr(config, "citation_cache_schema_name", None),
"cache.type": config.citation_cache_type,
"cache.data_dir": config.citation_cache_data_dir,
"cache.lock_dir": config.citation_cache_lock_dir,
"cache.url": config.citation_cache_url,
"cache.table_name": config.citation_cache_table_name,
"cache.schema_name": config.citation_cache_schema_name,

should work too

Comment on lines 19 to 24
"cache.type": getattr(config, "biotools_service_cache_type", "ext:database"),
"cache.data_dir": getattr(config, "biotools_service_cache_data_dir", None),
"cache.lock_dir": getattr(config, "biotools_service_cache_lock_dir", None),
"cache.url": getattr(config, "biotools_service_cache_url", config.database_connection),
"cache.table_name": getattr(config, "biotools_service_cache_table_name", None),
"cache.schema_name": getattr(config, "biotools_service_cache_schema_name", None),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"cache.type": getattr(config, "biotools_service_cache_type", "ext:database"),
"cache.data_dir": getattr(config, "biotools_service_cache_data_dir", None),
"cache.lock_dir": getattr(config, "biotools_service_cache_lock_dir", None),
"cache.url": getattr(config, "biotools_service_cache_url", config.database_connection),
"cache.table_name": getattr(config, "biotools_service_cache_table_name", None),
"cache.schema_name": getattr(config, "biotools_service_cache_schema_name", None),
"cache.type": config.biotools_service_cache_type,
"cache.data_dir": config.biotools_service_cache_data_dir,
"cache.lock_dir": config.biotools_service_cache_lock_dir,
"cache.url": config.biotools_service_cache_url,
"cache.table_name": config.biotools_service_cache_table_name,
"cache.schema_name": config.biotools_service_cache_schema_name,

Comment on lines 1 to 2
beaker==1.11.0 ; python_version >= "3.7" and python_version < "3.12"
sqlalchemy==1.4.46 ; python_version >= "3.7" and python_version < "3.12"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The package requirements are ideally unpinned. It seems that 1.12.1 fixes your issue, so I think we should remove the pin here (ideally for sqlalchemy too, or if 2.0 fails, to pin it to sqlalchemy<=2)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as pinning package versions in requirements.txt is concerned this has always been a best practice in my experience. In fact, the main galaxy requirements.txt specifies specific versions or ranges of versions for all referenced packages. If you do not pin a referenced package to a specific version or a version range there is no way to guarantee that what gets deployed is the same thing that was tested. This is especially true for third party packages as we have seen with the bug introduced in Beaker v1.12.0. I would think that test-requirements.txt should be consistent with the main requirements.txt file.

Copy link
Member

@mvdbeek mvdbeek Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's typically a distinction being made between libraries and applications that are shipped to users/deployers. Galaxy as an application pins all dependencies hard (i.e to specific, tested versions, see https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/dependencies/pinned-requirements.txt), while libraries should be pinned lightly, so that as the author of an application you have a good chance of being able to pick a set of compatible requirements.

And these packages here are published individually to pypi for re-use in other libraries or applications.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to Marius' comment, the central list of compatible "light" pinnings for Galaxy Python dependencies is in pyproject.toml , and the various *requirements.txt files inside packages/ should mirror them, see my other comment.

@@ -331,6 +331,8 @@ def _configure_toolbox(self):
mulled_resolution_cache = CacheManager(**parse_cache_config_options(cache_opts)).get_cache(
"mulled_resolution"
)
# If using database cache clear cache table contents
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be persistent across processes and restarts, what was the reason to change this ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I don't know the requirements for these caches. How long are entries expected to stay in the caches? I made this change based upon your comment in the meeting that you didn't want the default cache type to be database because you were concerned that data would accumulate in the cache table and not be cleaned out. I also assumed that in a production environment app restarts would be very infrequent. Is that not the case? So I figured this would be a good place to clean out the table to address your concern. I presume data accumulation would also be a problem with a file based cache. So why is there a concern specifically with the database cache?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restarts can be frequent, for config updates, or because of resources limits, and they can be simultaneous or staggered, so I don't think this is safe.

It's not specific to the database cache, but it's safe to assume people know how to delete a file, while dropping the appropriate table requires a bit more knowledge. The lifetime is typically unlimited, but we may have to drop data occasionally, that for instance was necessary when admins upgraded python versions with incompatible pickle protocols.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a long answer to say that I would default to a file-based cache for simplicity reasons as you're doing now, and database is a good option for production deployments with kubernetes, where you can run out of (very low) file lock limits. If the cache needs to be cleared admins have to do this manually, and that's easier with files.

@claudiofr
Copy link
Contributor Author

I had originally mentioned that the beaker docs say that the lock_dir is always required implying that using a database cache would not fix the galaxyproject/galaxy-helm#399 issue in which there were problems with the file system lock directory.
However, I looked at the beaker source code and it appears that when you use a database cache it does not use the file system for locking. Instead it relies on database locking. So it turns out that using a database cache should address issue 399 also.

@mvdbeek
Copy link
Member

mvdbeek commented Feb 22, 2023

That's great news, thanks for checking!

Rather than store cache data in the file system store cache data
in the database. Partially addresses issue 15216. For now this was
only changed for the mulled resolution cache. Introduced new
config parameters for database url, cache table name. Introduced
logic to default values of a parameter to values of another
parameter. i.e. default cache database url to database_connection.
Metadata for this was placed in code. Should probably be
placed eventually in config file which would require a change
to the config_schema.yml.

Introduced a new unit test, test_mulled_resolution_cache_db.py.
ext:database.

Rather than store cache data in the file system store cache data
in the database. These are the last 2 beaker caches that need
to be converted to the database as part of issue 15216.
Introduce new config parameters for database url, cache table name.
Introduce logic to default values of a parameter to values of another
parameter. i.e. default cache database url to database_connection.
Metadata for this was placed in code. Should probably be
placed eventually in config file which would require a change
to the config_schema.yml.

Introduced a new unit test, test_citations_db.py
Did not introduce a new test for the biotools_service beaker cache
because I could not find a place in the code where it was used.
based beaker cache.

Change test-requirements.txt to require v beaker 1.11.0 because latest
version of beaker apparently introduced a bug that broke the ability
to use database as a cache. Also fixed lint errors.
…eaker cache.

mypy kept complaining that module 'galaxy' has no attribute 'config'
on an 'from galaxy import config' statement. So I changed it to
'import galaxy.config'
All 3 beaker caches, mulled_resolution, citations, biotools service,
now use the same default cache table, beaker_cache. This reduces the
number of open db connections because beaker creates a separate
sqlalchemy engine and associated connection pool for each distinct
table and each connection pool opens up a certain number of connections.
This was causing the integration tests to fail because the max
postgress connections(100) were being exceeded.
claudiofr and others added 7 commits June 21, 2023 22:45
Clear the caches in the event the cache type is database. This will
delete any rows in the cache table that could be left over from a
prior run of the app preventing an accumulation of stale data.
The test/unit/app directory tree has a test-requirements.txt file that
brings in dependencies required by database utility modules imported by
the unit test.
Co-authored-by: Nicola Soranzo <[email protected]>
@mvdbeek mvdbeek changed the title Db beaker Expose additional beaker caching backends Jun 21, 2023
@mvdbeek mvdbeek merged commit ffa8e76 into galaxyproject:dev Jun 21, 2023
@mvdbeek
Copy link
Member

mvdbeek commented Jun 21, 2023

Thanks again @claudiofr, this is great work!

@nuwang
Copy link
Member

nuwang commented Feb 14, 2024

@claudiofr
Copy link
Contributor Author

claudiofr commented Feb 14, 2024 via email

@nuwang
Copy link
Member

nuwang commented Feb 14, 2024

Thanks for the clarification. Will discuss this with the Systems SIG and see how we can handle chart defaults going forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support sqla / database backend for beaker
6 participants