Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Staramr add DM and update tool with database information #6609

Open
wants to merge 46 commits into
base: main
Choose a base branch
from

Conversation

hugolefeuvre
Copy link
Contributor

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

hugolefeuvre and others added 16 commits December 2, 2024 16:47
…9f_plasmidfinder_3e77502/update/plasmidfinder/' content from commit 4add28296

git-subtree-dir: tools/staramr/test-data/resfinder_d1e607b_pointfinder_694919f_plasmidfinder_3e77502/update/plasmidfinder
git-subtree-split: 4add282963c788762cdc3c6e91eb46c3d109f19b
…ramr/test-data/resfinder_d1e607b_pointfinder_694919f_plasmidfinder_3e77502/update/plasmidfinder'
…9f_plasmidfinder_3e77502/update/resfinder/' content from commit e0525f203

git-subtree-dir: tools/staramr/test-data/resfinder_d1e607b_pointfinder_694919f_plasmidfinder_3e77502/update/resfinder
git-subtree-split: e0525f203e43bcd6dc91348e561a2cb09fd4014b
…ramr/test-data/resfinder_d1e607b_pointfinder_694919f_plasmidfinder_3e77502/update/resfinder'
…9f_plasmidfinder_3e77502/update/pointfinder/' content from commit 694919f59

git-subtree-dir: tools/staramr/test-data/resfinder_d1e607b_pointfinder_694919f_plasmidfinder_3e77502/update/pointfinder
git-subtree-split: 694919f59a38980204009e7ade76bf319cb7df0b
…ramr/test-data/resfinder_d1e607b_pointfinder_694919f_plasmidfinder_3e77502/update/pointfinder'
…rch, used only for test with hidden parameter
Copy link
Contributor

@bernt-matthias bernt-matthias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main question is if the pointfinder, resfinder, and plasmidfinder datasets should be handled in three datatables? In the tool they might be linked together into one folder? Depends a bit how how the contents of these datasets looks like.

@@ -0,0 +1,12 @@
name: data_manager_build_staramr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could update the shed.yml file for the data mananger. Currently it reads as for the tool.

Comment on lines +19 to +23
<stdio>
<exit_code range=":-1" level="fatal" description="Error: Cannot open file"/>
<exit_code range="1:" level="fatal" description="Error"/>
</stdio>
<command><![CDATA[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

equivalent:

Suggested change
<stdio>
<exit_code range=":-1" level="fatal" description="Error: Cannot open file"/>
<exit_code range="1:" level="fatal" description="Error"/>
</stdio>
<command><![CDATA[
<command detect_errors="exit_code"><![CDATA[

<command><![CDATA[
#import re

#set $match_resfinder = re.search(r'^resfinder_(?:[^_]*_)?([0-9a-z]*)_[0-9]{4}-[0-9]{2}-[0-9]{2}$', str($resfinder.resfinder_database_select))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could simplify parsing by make the values a comma-separated string (then you can just use split). Of course any other character will do as well.

"staramr_database":[
{
"value": "staramr_downloaded_#echo date.today().strftime('%d%m%Y')#_resfinder_${resfinder_commit}_pointfinder_${pointfinder_commit}_plasmidfinder_${plasmidfinder_commit}",
"name": "starAMR databases with ResFinder: ${resfinder_tag_commit_date}, PointFinder: ${pointfinder_tag_commit_date}, PlasmidFinder: ${plasmidfinder_tag_commit_date}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this info shown to the user contain the hash? Maybe do not show it?


#set $named_genomes = ''
#for $genome in $genomes
#set $_named_genome = re.sub(r'(\s|\(|\)|\/)', '_', '"{}.fasta"'.format($genome.element_identifier))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better something like

#set read1_file = re.sub('[^\w\-\.]', '_', str($read1.element_identifier)) + '_forward' + $ext

Comment on lines +772 to +777
+------------+---------------------+---------------+---------+---------+---------+---------+---------+---------+----------+
| Isolate ID | Scheme | Sequence Type | Locus 1 | Locus 2 | Locus 3 | Locus 4 | Locus 5 | Locus 6 | Locus 7 |
+============+=====================+===============+=========+=========+=========+=========+=========+=========+==========+
| SRR1952908 | senterica_achtman_2 | 11 | aroC(5) | dnaN(2) | hemD(3) | hisD(7) | purE(6) | sucA(6) | thrA(11) |
| SRR1952926 | senterica_achtman_2 | 11 | aroC(5) | dnaN(2) | hemD(3) | hisD(7) | purE(6) | sucA(6) | thrA(11) |
+------------+---------------------+---------------+---------+---------+---------+---------+---------+---------+----------+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference would be a list of column definitions instead of an example. Up to you.

</section>
</inputs>
<outputs>
<data format="tabular" name="mlst" label="${tool.name} on ${on_string}: MLST report" />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite a few outputs .. maybe make some optional by using filters?

Comment on lines +301 to +310
<conditional name="complex_mutations">
<param name="complex_mutations_condition" type="select" label="Provide a custom list of complex mutations">
<option value="default" selected="true">Use default list of complex mutations</option>
<option value="custom">Provide a custom list of complex mutations</option>
</param>
<when value="default" />
<when value="custom">
<param type="data" name="complex_mutations_file" format="txt,tabular" label="Complex mutations file" help="Pass a file containing a list of complex mutations to group (if present) in the PointFinder results" />
</when>
</conditional>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the conditional one could just use an optional data parameter? Problem with conditionals is that they are fixed in workflows.

Comment on lines +134 to +137
<param argument="--genome-size-lower-bound" type="integer" label="The lower bound for our genome size for the quality metrics" value="4000000" />
<param argument="--genome-size-upper-bound" type="integer" label="The upper bound for our genome size for the quality metrics" value="6000000" />
<param argument="--minimum-N50-value" type="integer" label="The minimum N50 value for the quality metrics" value="10000" />
<param argument="--minimum-contig-length" type="integer" label="The minimum contig length for the quality metrics" value="300" />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These integers could also have a min, or? Maybe 0?

]]></command>
<inputs>
<param name="hide_db_build" type="hidden" value="" />
<param type="data" name="genomes" format="fasta" multiple="true"/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is fasta.gz also possible?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants