Avoid time-consuming blocking operations in Librarian server #17

Open
pkgw opened this issue Jun 8, 2016 · 1 comment

Comments

@pkgw
Contributor

pkgw commented Jun 8, 2016

The Librarian server defaults to using Tornado so that it can handle many client connections simultaneously. However, the Librarian still performs various long-running operations in the main thread, so it can block for a long time and leave the server unresponsive.

The main culprit is the MD5 sum calculation, which can be slow. It's not actually done on the server, but the server sits around waiting for the result to come back over its SSH connection to the pot. Ideally, any server code that calls Store.get_info_for_path or File.get_inferring_info should use a background thread, or nonblocking I/O, to avoid blocking. This would involve converting them to return some kind of promise, and I believe it would require using Tornado's HTTP framework rather than Flask's to allow asynchronous responses.
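To make the "return some kind of promise" idea concrete, here is a minimal sketch using only the standard library's `concurrent.futures`. The names `slow_get_info` and `get_info_async` are hypothetical stand-ins, not Librarian code: the real `Store.get_info_for_path` waits on an SSH connection, whereas this example just sleeps and hashes locally to simulate the delay.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor


def slow_get_info(data: bytes) -> dict:
    """Hypothetical stand-in for a slow call like Store.get_info_for_path."""
    time.sleep(0.1)  # simulate waiting on the remote MD5 computation
    return {"md5": hashlib.md5(data).hexdigest(), "size": len(data)}


# A small pool of worker threads that request handlers could share.
executor = ThreadPoolExecutor(max_workers=4)


def get_info_async(data: bytes):
    """Return a Future right away instead of blocking the caller."""
    return executor.submit(slow_get_info, data)


future = get_info_async(b"hello")
# The calling thread is free to do other work here while the worker runs.
info = future.result()  # blocks only at the point the result is needed
```

In a server, the handler would hand the future to the framework's event loop rather than calling `result()` directly; that part depends on the HTTP framework and is not shown here.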

In principle we should also be careful about anything that SSHs into a store. Under the current scheme it would probably be a lot of work to async-ize all of those bits of code, though, and in most cases the calls (mv, mkdir, etc.) should be near-instantaneous.

@pkgw
Contributor Author

pkgw commented Dec 20, 2016

The more I look into this, the more it seems that this will be very difficult to do without switching to Python 3 and using its asynchronous features. There does not, however, appear to be a clear recommended way to use SQLAlchemy asynchronously. People have written wrappers, e.g., arstecnica.sqlalchemy.async or sAsync. But none of them is officially blessed, and one suspects there's a reason that the main package hasn't attempted this.
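For reference, one workaround that needs no wrapper package is to leave the database calls synchronous and push them onto a worker thread from Python 3's `asyncio`. The sketch below assumes that approach and uses the standard library's `sqlite3` as a stand-in for SQLAlchemy; `blocking_query` and `count_files` are hypothetical names. It does not make the database layer asynchronous, it only keeps the blocking call off the event loop.

```python
import asyncio
import sqlite3


def blocking_query(n: int) -> int:
    """Stand-in for a synchronous database call (not SQLAlchemy itself)."""
    # The connection is created inside the worker thread, because sqlite3
    # connections may not be shared across threads by default.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE files (name TEXT)")
        conn.executemany(
            "INSERT INTO files VALUES (?)", [(f"f{i}",) for i in range(n)]
        )
        (count,) = conn.execute("SELECT COUNT(*) FROM files").fetchone()
        return count
    finally:
        conn.close()


async def count_files(n: int) -> int:
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default thread pool executor.
    return await loop.run_in_executor(None, blocking_query, n)


async def main() -> list:
    # Both queries run in worker threads; the event loop stays responsive.
    return await asyncio.gather(count_files(2), count_files(3))


results = asyncio.run(main())
```

The cost of this design is that session and connection handling must be kept thread-safe by hand, which may be part of why none of the wrappers has become standard.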
