allowing multiple pack storage locations #123
Comments
Proof of concept: PR #126. Pinning @giovannipizzi @chrisjsewell
After discussion with @zhubonan and @chrisjsewell the following design could be envisaged:
As a power user, I can then create folders inside archived-packs and mount them from some remote location. In addition, there should be a function to check that all packs are actually there (e.g. to catch the case where one of the archived folders is not mounted), and ideally it would also use a checksum for further validation. A simple file-existence check should hopefully be fast, so it could be done every time a new container instance is created, raising an exception if a pack is missing. Finally, it should be easy for the user to archive the packs. E.g. one could have a command
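A minimal sketch of the existence check described above, not the package's actual API. It assumes that packs are files named by their integer id, that the caller already knows which pack ids to expect, and that the list of directories to search is passed in explicitly. The names `find_missing_packs`, `check_packs_present`, `MissingPacksError` and `sha256_of` are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List, Sequence


class MissingPacksError(RuntimeError):
    """Hypothetical exception raised when an expected pack file is not found."""


def find_missing_packs(expected_ids: Iterable[int], pack_dirs: Sequence[Path]) -> List[int]:
    """Return the expected pack ids that exist in none of the given directories."""
    missing = []
    for pack_id in expected_ids:
        # Assumption: a pack is stored as a file whose name is its integer id.
        if not any((directory / str(pack_id)).is_file() for directory in pack_dirs):
            missing.append(pack_id)
    return missing


def check_packs_present(expected_ids: Iterable[int], pack_dirs: Sequence[Path]) -> None:
    """Raise MissingPacksError if any pack is missing, e.g. an unmounted folder."""
    missing = find_missing_packs(expected_ids, pack_dirs)
    if missing:
        raise MissingPacksError(f"Packs not found in any storage location: {missing}")


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Optional, slower validation: SHA-256 hex digest of a pack file."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The existence check only does one `stat` per pack and directory, so it should be cheap enough to run on every instantiation. The checksum reads the whole file and is probably better run on demand.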
@giovannipizzi Thanks for the summary! One potential issue I can think of with this is that if the user has multiple profiles, and hence multiple repositories, they could make mistakes when mounting the correct folder inside At the moment the packs are stored as numbered files, e.g.
Good point, thanks! Either that, or have a JSON in the folder that gives information. But I agree
After re-discussion with @zhubonan we realized that the logic described here is probably too complex. Probably the easiest is to mount just the
One problem I face with my current AiiDA-based workflow is the growing size of the repository versus the finite size of the fast SSD storage. This can happen quite quickly if I have to run a few "large" calculations for which a lot of data is needed during post-processing and is provenance-critical. In theory, most of the files stored by AiiDA are not frequently accessed, and they are perfectly fine sitting in a slow storage location, e.g. a spinning disk or an NFS mount. On the other hand, having the whole repository in a slow storage location can slow down the daemon and workflows.
I think this package can give a natural solution to this problem. Here, the loose "objects" can be written onto a fast-to-write disk. Read-only access to the "fully" packed packs no longer benefits from fast disk speed, so they can be moved to slow storage if needed, e.g.:
At the moment, all of the packs (named by integer numbers) are stored under the `packs` folder. Would it be possible to allow multiple storage locations to be used (for the fully "packed" ones)? I think it should just be a matter of iterating over the storage locations and checking whether the file exists, or a dictionary of pack ids and their locations could be built when the `Container` class is instantiated, to reduce the overhead.
Please let me know what you think about this idea. Thanks!
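To make the second option concrete, here is a rough sketch of building the pack-id-to-location dictionary once, e.g. at instantiation time. It is only an illustration under my own assumptions: `build_pack_index` is a hypothetical helper, not something that exists in the package, and it assumes every pack is a file whose name is its integer id.

```python
from pathlib import Path
from typing import Dict, Iterable


def build_pack_index(pack_dirs: Iterable[Path]) -> Dict[int, Path]:
    """Scan each storage location once and map pack id -> pack file path."""
    index: Dict[int, Path] = {}
    for directory in pack_dirs:
        if not directory.is_dir():
            # E.g. a slow storage location that is currently not mounted.
            continue
        for entry in directory.iterdir():
            name = entry.name
            # Assumption: pack files are named with plain ASCII digits only.
            if not (entry.is_file() and name.isascii() and name.isdigit()):
                continue
            pack_id = int(name)
            if pack_id in index:
                raise ValueError(
                    f"Pack {pack_id} found in both {index[pack_id].parent} and {directory}"
                )
            index[pack_id] = entry
    return index
```

After this, looking up a pack is a single dictionary access instead of a file-existence check in every location. Refusing duplicates is a design choice of this sketch; preferring the first location would also be possible.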