provide a small dataset for quick code tests #46

Closed
2 tasks done
fmalmeida opened this issue Jan 6, 2022 · 9 comments

@fmalmeida
Owner

fmalmeida commented Jan 6, 2022

Provide a small dataset to quickly check the code integrity.

The pipeline already has a test dataset that exercises the majority of processes with -profile test. However, it is not very small and generally takes ~1-2 h to finish.

So, when updating code, we generally want something quicker just to check its integrity. Therefore, it would be nice to have a very small dataset that enables this in less than 40 min.

Task:

  • provide a small dataset for quickly testing code integrity while still executing most of the processes
  • enable this with -profile smaller-test
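
A minimal sketch of how such a quick check might be wrapped in a script, assuming Nextflow is available on the PATH. The pipeline name `owner/pipeline` and the helper names are hypothetical placeholders; only `-profile smaller-test` and the 40 min budget come from the tasks above.

```python
import subprocess
import time

# Time budget from the issue text: the quick test should finish in under 40 min.
QUICK_TEST_BUDGET_S = 40 * 60


def build_command(pipeline: str, profile: str = "smaller-test") -> list[str]:
    """Return the argument list for `nextflow run <pipeline> -profile <profile>`."""
    return ["nextflow", "run", pipeline, "-profile", profile]


def run_quick_test(pipeline: str, profile: str = "smaller-test") -> float:
    """Run the pipeline with the given profile and return the elapsed seconds.

    Raises subprocess.CalledProcessError if the run exits with a non-zero
    status, and subprocess.TimeoutExpired if it exceeds the time budget.
    """
    start = time.monotonic()
    subprocess.run(
        build_command(pipeline, profile),
        check=True,
        timeout=QUICK_TEST_BUDGET_S,
    )
    return time.monotonic() - start
```

Calling `run_quick_test("owner/pipeline")` (with the placeholder replaced by the real pipeline) would then fail loudly on either a broken process or a run that is too slow to count as a quick test.
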
@abhi18av
Contributor

Agreed, this would be super helpful, even for GitHub CI/Actions tests.

@fmalmeida
Owner Author

I found that a run with the Haemophilus influenzae genome takes only 9 min to finish and tests almost all modules; only the assembly modules and the methylation calling module are not run. Since the pipeline is properly compiled and runs all the others successfully, it seems like a good dataset for quickly testing code integrity and the executed modules.

Now the new test profile just needs to be added to develop and brought out in a new patch release.

@fmalmeida
Owner Author

Has been added in develop by commit: ed6f51a

However, the URLs inside these configs already point to master, so they will only be useful once this branch is merged into master.
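
To illustrate why the branch matters, here is a small sketch that assumes the configs reference their files through raw GitHub URLs. In that scheme the branch name is part of the URL, so a file that so far exists only in develop is not reachable through a master URL. The owner, repository, and file path below are placeholders, not values taken from this thread.

```python
def raw_url(owner: str, repo: str, branch: str, path: str) -> str:
    """Build a raw.githubusercontent.com URL for a file on a given branch."""
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{path.lstrip('/')}"


# Hypothetical file path: the same file resolves to different URLs per branch.
master = raw_url("owner", "repo", "master", "conf/small_dataset.config")
develop = raw_url("owner", "repo", "develop", "conf/small_dataset.config")
```
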

After that, I'd have to learn a little more about GitHub Actions to make these check-ups automatic.

@abhi18av
Contributor

This sounds great @fmalmeida - do we have a tentative date for the next release (or merge)?

@fmalmeida
Owner Author

fmalmeida commented Jan 20, 2022

Hi @abhi18av,

For the bigger release, which is related to issue #36 and the draft PR #44, I don't have a forecast yet, because I am not finding much time to spend on these implementations now that the pipeline is stable. You can see that the "remodelling" branch implementation (related to issue #36) is progressing very slowly and may take a good amount of time to finish.

However, the smaller changes, the ones addressed in the issues you've contributed to, are already merged into develop. From my latest tests, the branch already seems stable. I just want to test it two more times with a few datasets of mine before bringing it forward.

For these smaller ones, I think a new release, 3.0.1, could be published within the next two weeks.

😄

@abhi18av
Contributor

Ah, okay, this makes sense Felipe, not a problem - time is a limited resource 😉

But thinking further about this, I think that an nf-core/modules-like approach to testing (pytest) might make more sense now that we have a smaller dataset.

This would ideally be done in conjunction with the possible refactoring of modules.

But no hurry. I think that sometimes, more than engineering, a pipeline (or product) needs more users to guide the overall development :)
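
As a rough illustration of the pytest idea, the sketch below checks that a run left the expected files behind. It is a plain pytest-style example, not the actual nf-core/modules setup, and the helper name and file names are hypothetical.

```python
from pathlib import Path


def missing_outputs(results_dir: Path, expected: list[str]) -> list[str]:
    """Return the expected relative paths that are absent or empty."""
    missing = []
    for rel in expected:
        target = results_dir / rel
        if not target.is_file() or target.stat().st_size == 0:
            missing.append(rel)
    return missing


def test_quick_run_outputs(tmp_path: Path) -> None:
    # Stand-in for a real results directory produced by the small-dataset run.
    (tmp_path / "annotation").mkdir()
    (tmp_path / "annotation" / "sample.gff").write_text("##gff-version 3\n")

    assert missing_outputs(tmp_path, ["annotation/sample.gff"]) == []
    assert missing_outputs(tmp_path, ["report.html"]) == ["report.html"]
```

In a real setup the test would point `missing_outputs` at the output directory of a finished quick run instead of a temporary directory.
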

@fmalmeida
Owner Author

fmalmeida commented Jan 20, 2022

I don't know much about the pytest setup that nf-core runs in their pipelines/modules. But I think it is worth learning anything that would make testing easier.

If you could point me to such examples and how they are done or configured, so I can try to learn more about them, it would be nice.

No worries, every input is valuable, and I am pleased about the discussions and inputs you've brought to me 😄

And yes, I agree with you, these automatic testing implementations could be done in conjunction with the refactoring of modules 😄

@abhi18av
Contributor

If you could point me to such examples and how they are done or configured, so I can try to learn more about them, it would be nice.

Sure, Felipe 👍

Actually, I saw this practice initiated by the nf-core folks; it is documented in the talks here

Beyond nf-core, Robert made an independent effort for bactopia (https://github.com/bactopia/bactopia), which relies on the data here: https://github.com/bactopia/bactopia-tests

However, if you have other ideas in mind, I'd be happy to discuss them and try them out with you :)

@fmalmeida
Owner Author

Hi @abhi18av,

Many thanks for pointing out these sources. I will certainly make an effort to read and study them.

Knowing how to speed up and automate tests will be great for this repository specifically and also for my future work.

Thank you.

😁😁
