re-think modules organization and structure #36
Adding this here, since the question above would decide whether the following information is relevant or not (see line 111 in 0cd2c21):
If we continue to use custom modules, then the allocation of module-level parameters could be simplified, and all the pipeline parameters could then be collected within the config file itself, as shown here.
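Collecting all the pipeline parameters inside the config file could look like the following minimal sketch (the parameter names and values here are hypothetical placeholders, not the pipeline's actual ones):

```groovy
// nextflow.config -- hypothetical sketch: all pipeline params in one place
params {
    outdir    = "results"       // hypothetical output directory param
    threads   = 4               // hypothetical CPU count param
    databases = "databases/"    // hypothetical path to pre-built databases
}
```

Modules then read `params.outdir` etc. directly, so no per-module parameter declarations are needed.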
Hi @abhi18av, thank you for pointing that out. However, I believe this one is not relevant, because I don't actually use module-level parameters; I only use the parameters that are set globally for the workflow. This was a mistake: I thought that I had to declare the parameters I wanted to use, but I figured out I didn't have to. These have been removed in the new 3.x version of the pipeline.
Note on the issue: I will try to address this issue in the https://github.com/fmalmeida/bacannot/tree/remodeling branch. Steps:
Found a small hindrance while updating the pipeline in the https://github.com/fmalmeida/bacannot/tree/remodeling branch, which needs to be addressed before continuing:
Need to think about this.
Decided to always use biocontainer images whenever possible, and to create a …
Definitely a great step forward! 👍 Might I suggest cross-publishing it on the Docker Hub and quay.io registries, so that there's a fallback in case you run into Docker Hub rate limits?
What do you mean by cross-publishing? Can I upload images to quay.io?
Oh, nothing fancy - just that you'll need to tag the same image twice: first for pushing to Docker Hub, and then again for pushing to quay.io.
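In shell terms, the twice-tagging could look like this sketch (the image name and tag below are hypothetical placeholders, not the pipeline's actual coordinates):

```shell
# Hypothetical image coordinates -- substitute your own.
IMAGE="fmalmeida/bacannot"
TAG="v3.0"

# Tag the locally built image once per registry, then push both;
# quay.io acts as a fallback if Docker Hub rate limits are hit.
cross_publish() {
    docker tag  "${IMAGE}:${TAG}" "docker.io/${IMAGE}:${TAG}"
    docker tag  "${IMAGE}:${TAG}" "quay.io/${IMAGE}:${TAG}"
    docker push "docker.io/${IMAGE}:${TAG}"
    docker push "quay.io/${IMAGE}:${TAG}"
}
# Call cross_publish after building the image and logging in to both registries.
```

Nothing changes in the image itself; the two tags point at the same image ID, so the second push only uploads the manifest plus any layers the registry is missing.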
The hindrance mentioned in this comment, and thought to be solved in this other comment, is actually yet to be solved. While implementing the changes, we saw that switching to biocontainers would not cover all of the modules: we would still need two or three different images to hold the other tools and programming languages that are required. After some thought, we felt that having "half" from biocontainers and "half" from custom Docker images would not be the most standardized option. So we are yet again to decide on the dilemma:
... 🤔 ...
Hi @fmalmeida, unless I'm mistaken, these are all of the Docker images used in the pipeline, right? https://github.com/fmalmeida/bacannot/blob/develop/conf/docker.config
Also, if possible, could we discuss this further using some example modules? It's a bit too abstract for me at the moment 😅
Hi @abhi18av, it is nothing too fancy. It is just that some modules, such as the one that renders the Rmarkdown reports or the one for the genome browser, have a few dependencies (scripts or multiple libraries) that would not be available inside the biocontainers images, which are designed to contain only the tool (or conda package) itself. And some modules, such as digIS, are not available in conda at all. So, even after switching the ones that could be switched to biocontainers, some modules would still require non-biocontainer images like the … The "problem" is not actually a real problem; it is just a matter of concept: I am not too much of a fan of such mixtures, I like things more standardized 😅 Just to be clear, I am still going to refactor the modules to read the databases from outside the containers, as suggested when opening the issue. But, instead of changing everything to biocontainers, I am thinking of:
And yes, these are the Docker images used. The point is just that, instead of pointing 60-70% of the modules to biocontainers, I am leaning towards creating these custom smaller images, which, for me, would be easier to maintain. 😁
Hi Felipe, thanks for sharing the full context! Allow me to respond after taking a deeper look at this today; by tomorrow I'll share my thoughts (if you still need them 😅). Perhaps we can find an optimal strategy for this one.
Hi Abhinav, thanks for the interest. For this issue specifically, we have discussed it here in the lab and decided to go on as described for now, since that approach is already established for our local pipelines: it follows the standards of the lab and requires fewer changes for the time being. But surely, I'd like to hear your thoughts; maybe I could use them to find an optimal strategy, as you said, for future releases or future pipelines. 😄
Ah, okay. Yeah, maintenance burden is an important factor 👍 Then I'll just summarize my suggestions here:
The second point could be used in combination with the … Hope this helps (in some way 😉)!
Many thanks for the suggestions! About the first one: we were already facing some problems with the image sizes, which also helped trigger this issue 😅 And about the second one, I just loved the idea. I didn't know about these options and how useful they could be. Thanks again for these nice resources 😄
It would be nice if the pipeline were also capable of running with Singularity and Conda, instead of only Docker, which would be more similar to what is found in nf-core modules. To do that, the following should be addressed:
--databases
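Supporting Singularity and Conda alongside Docker is typically expressed as Nextflow profiles. A minimal sketch of what that could look like (the profile bodies below are an assumption for illustration, not taken from the pipeline's actual config):

```groovy
// nextflow.config -- hypothetical sketch of per-engine profiles
profiles {
    docker {
        docker.enabled = true
    }
    singularity {
        singularity.enabled    = true
        singularity.autoMounts = true
    }
    conda {
        conda.enabled = true
    }
}
```

Users would then pick an engine at launch time with `-profile docker`, `-profile singularity`, or `-profile conda`; this is the same pattern nf-core pipelines follow.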
These changes are related, and would also make the following issues possible, or easier to implement: