SRST2 munges sample identifiers? #267

Open
hexylena opened this issue Oct 28, 2024 · 0 comments

Comments

@hexylena
Contributor

I'm looking at a run of PHAC's SRST2 wrapper:

/srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/nml/srst2/e59fdf6145db/srst2/srst2.pl /data/galaxy/jobs/020/20188/outputs/dataset_cef3ef2e-e010-4768-b36b-1de43d89b28f.dat /data/galaxy/jobs/020/20188/outputs/dataset_582ba0b2-a2eb-4918-a8b2-9f300b6f212a.dat /data/galaxy/jobs/020/20188/outputs/dataset_d6eec7c7-a2c9-4bcd-a919-63b192c9935e.dat  g /data/galaxy/jobs/020/20188/outputs/dataset_14d30ba8-5d8d-4237-a672-243227aac1d7.dat /data/galaxy/jobs/020/20188/outputs/dataset_473b25c9-757c-4a56-8170-df8add14378b.dat \
"ResFinder.fasta,ARGannot_r2.fasta"  \
"SRX6855211_SRR10127028_1.fastq uncompressed" \
--input_pe "/data/galaxy/f/e/1/dataset_fe1245a6-287d-4732-af60-37021f7eaab1.dat" \
"/data/galaxy/5/6/e/dataset_56eedb18-918a-4c26-a5db-f3504dd763c2.dat" \
 --gene_db /data/galaxy/1/6/2/dataset_162c6169-c72f-4f07-a99d-dfd15075ab8e.dat /data/galaxy/5/6/d/dataset_56da7859-6beb-47cd-b701-3b9bcae0e427.dat \
--gene_max_mismatch 250  --read_type q  --save_scores  --other "'-p ${GALAXY_SLOTS:-1}'"  --output ${PWD}/out

(line breaks manually inserted for clarity)

However, the Sample column of the output table contains only part of the sample identifier (SRX6855211, while the input was named SRX6855211_SRR10127028_1.fastq):

Sample	DB	gene	allele	coverage	depth	diffs	uncertainty	divergence	length	maxMAF	clusterid	seqid	annotation
SRX6855211	ARGannot_r2	AmpC2_Ecoli_Bla	AmpC2_346	99.735	86.838	22snp3indel		1.94	1134	0.091	99	346	no;no;AmpC2;Bla;CP002970;332756-333889;1134
SRX6855211	ARGannot_r2	MrdA_Bla	MrdA_836	100.0	66.382	25snp		1.314	1902	0.143	16	836	no;no;MrdA;Bla;CP002291;666340-664439;1902
SRX6855211	ARGannot_r2	MphA_MLS	MphA_1663	100.0	69.11			0.0	906	0.096	158	1663	no;no;MphA;MLS;KR091911;890-1795;906
SRX6855211	ARGannot_r2	CTX-M-9_Bla	CTX-M-27_109	100.0	60.304			0.0	876	0.078	190	109	no;no;CTX-M-27;Bla;AY156923;1-876;876

which means I can't match the results back up with their input samples once they've gone through hamronize (which I really love).
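As a possible stopgap, the truncated IDs could be mapped back to the full ones by prefix. This is only a minimal sketch: it assumes the name is cut at the first underscore (which is what the table above suggests, but I haven't confirmed it in the SRST2 code) and that the full identifiers are known on the Galaxy side. `build_prefix_map` and `restore_sample_ids` are hypothetical helper names, not part of SRST2 or hamronize.

```python
import csv
import io


def build_prefix_map(full_ids, sep="_"):
    """Map each truncated ID (text before the first `sep`) to its full ID.

    Truncated IDs shared by several full IDs are ambiguous and are left out.
    """
    candidates = {}
    for full in full_ids:
        candidates.setdefault(full.split(sep, 1)[0], []).append(full)
    return {short: fulls[0] for short, fulls in candidates.items() if len(fulls) == 1}


def restore_sample_ids(tsv_text, full_ids):
    """Rewrite the Sample column of an SRST2-style TSV using the full IDs."""
    prefix_map = build_prefix_map(full_ids)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for row in reader:
        # Rows whose Sample has no unambiguous match are left unchanged.
        row["Sample"] = prefix_map.get(row["Sample"], row["Sample"])
        rows.append(row)
    return rows


# Tiny excerpt shaped like the table above (first three columns only).
example = (
    "Sample\tDB\tgene\n"
    "SRX6855211\tARGannot_r2\tAmpC2_Ecoli_Bla\n"
)
rows = restore_sample_ids(example, ["SRX6855211_SRR10127028"])
print(rows[0]["Sample"])
```

This breaks down as soon as two inputs share the same prefix (for example, several runs under one SRX accession), in which case the sketch deliberately leaves the Sample value untouched, so it doesn't replace keeping the full identifier in the output.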
