combineRMAlignOutput process not working #3

Open
manighanipoor opened this issue Sep 26, 2024 · 3 comments
@manighanipoor
Hi
I am running RepeatMasker Nextflow with a custom TE library on a genome assembly. All processes completed successfully except the last one (combineRMAlignOutput), which failed with the following error (I got the same error when running RepeatMasker Nextflow on other genomes as well):

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

    Pipeline execution summary
    ---------------------------

executor > local (1), slurm (115)
[89/6d4e0a] process > warmupRepeatMasker (1) [100%] 1 of 1 ✔
[c6/11abd9] process > genBatches (1) [100%] 1 of 1 ✔
[cb/75fcc5] process > RepeatMasker (31) [100%] 38 of 38 ✔
[fe/728d24] process > combineRMOUTOutput (20) [100%] 38 of 38 ✔
[45/0852f6] process > combineRMAlignOutput (38) [ 5%] 2 of 38, failed: 2
Error executing process > 'combineRMAlignOutput (1)'

Caused by:
Process combineRMAlignOutput (1) terminated with an error exit status (25)

Command executed:

for f in batch-38.fa.align; do cat $f >> combAlign; done ####/hpcfs/users/a1177955/local/RepeatMasker_Nextflow/alignToBed.pl -fullAlign combAlign | /hpcfs/users/a1177955/local/UCSC_tools/bedSort stdin stdout | /hpcfs/users/a1177955/local/RepeatMasker_Nextflow/bedToAlign.pl > combAlign-sorted
/hpcfs/users/a1177955/local/RepeatMasker_Nextflow/alignToBed.pl -fullAlign combAlign > tmp.bed

# Be mindful of this buffer size...should probably make this a parameter

sort -k1,1V -k2,2n -k3,3nr -S 3G -T /hpcfs/users/a1177955/HTT_sea_snake/synteny/repeatmasker/hydrophis_cyanocinctus/work tmp.bed > tmp.bed.sorted
/hpcfs/users/a1177955/local/RepeatMasker_Nextflow/bedToAlign.pl tmp.bed.sorted > combAlign-sorted
/hpcfs/users/a1177955/local/RepeatMasker_Nextflow/renumberIDs.pl -translation combOutSorted-translation.tsv combAlign-sorted > combAlign-sorted-renumbered
gzip -c combAlign-sorted-renumbered > hydrophis_cyanocinctus.rmalign.gz

Command exit status:
25

Command output:
(empty)

Command error:
Could not find translation for ID: b38_1
Found in line: 1336 20.67 11.67 0.00 chr4 201472739 201473038 (19427034) C rnd-5_family-5278_s_2#LINE/L1 (1484) 9915 9581 b38_1 m_b1s001i0

Work dir:
/hpcfs/users/a1177955/HTT_sea_snake/synteny/repeatmasker/hydrophis_cyanocinctus/work/49/389484c8e39378c87c46d24f0899c6

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

WARN: Killing running tasks (36)

As I said, the issue replicates in other runs, while I did not have the issue with your sample test data.

Cheers,
Mani

@manighanipoor
Author

I realized that the combOutSorted-translation.tsv file and, as a result, the .rmout file contain data for only one batch; the other batches are missing.
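
One way to confirm this observation is to compare the batch prefixes present in the two files. The following is a minimal sketch, not part of the pipeline. It assumes that IDs follow the `b<batch>_<n>` pattern seen in the error message (`b38_1`). The column layout of combOutSorted-translation.tsv is not shown in this thread, so the sketch scans every whitespace-separated field rather than a specific column. The function names `batch_prefixes` and `missing_batches` are hypothetical.

```shell
#!/usr/bin/env bash
# Print the distinct batch prefixes (e.g. "b38") that occur in a file.
# Assumption: IDs look like b<batch>_<n>, as in "b38_1" from the error message.
batch_prefixes() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^b[0-9]+_[0-9]+$/) { split($i, p, "_"); print p[1] }
    }' "$1" | sort -u
}

# Print the prefixes that occur in the first file but not in the second.
missing_batches() {
    comm -23 <(batch_prefixes "$1") <(batch_prefixes "$2")
}

# Example, run inside the failing work dir:
#   missing_batches combAlign-sorted combOutSorted-translation.tsv
```

If the translation table was built from a different batch than the alignment file staged for the same task, the command lists the batches that have no translation, which would be consistent with the "Could not find translation for ID: b38_1" error.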

@rmhubley
Member

rmhubley commented Dec 6, 2024

I am not sure what buffer size you are referring to. Perhaps the sort temporary directory (-T)? In any case, I just pushed an update for DSL2, and in this version the batches are no longer combined using a "for" loop and are now concatenated using the "combine" operator.

@manighanipoor
Author

Thanks Robert,

I have run your updated version on my university HPC:

nextflow run /hpcfs/users/a1177955/local/RepeatMasker_Nextflow/RepeatMasker_Nextflow.nf --inputSequence "$PWD/${OUTGENOME}.fasta" --outputDir "$PWD" --inputLibrary "$PWD/${HTTNAME}.fasta" --cluster hpc1 --batchSize 500000000

But it failed with:

-- Check '.nextflow.log' file for details

executor > local (1), slurm (6)
[18/de3cab] process > warmupRepeatMasker [100%] 1 of 1 ✔
[a8/1faa0e] process > genTwoBitFile [100%] 1 of 1 ✔
[9b/f896d2] process > genBatches [100%] 1 of 1 ✔
[1d/5f0041] process > RepeatMasker (3) [ 50%] 2 of 4, failed: 1
[- ] process > combineRMOUTOutput -
[- ] process > combineRMAlignOutput -
ERROR ~ Error executing process > 'RepeatMasker (1)'

Caused by:
Process RepeatMasker (1) terminated with an error exit status (140)

Command executed:

# Run RepeatMasker and readjust coordinates

/hpcfs/users/a1177955/local/UCSC_tools/twoBitToFa -bed=batch-1.bed GCA_030407125.1.2bit batch-1.fa
/gpfs/apps/icl/software/RepeatMasker/4.1.5-foss-2021b//RepeatMasker -a -engine rmblast -pa 3 -lib Hero-1_A.laev.fasta batch-1.fa >& batch-1.rmlog
export REPEATMASKER_DIR=/gpfs/apps/icl/software/RepeatMasker/4.1.5-foss-2021b/
/hpcfs/users/a1177955/local/RepeatMasker_Nextflow/adjCoordinates.pl batch-1.bed batch-1.fa.out
/hpcfs/users/a1177955/local/RepeatMasker_Nextflow/adjCoordinates.pl batch-1.bed batch-1.fa.align
cp batch-1.fa.out batch-1.fa.out.unadjusted
mv batch-1.fa.out.adjusted batch-1.fa.out
mv batch-1.fa.align.adjusted batch-1.fa.align

Command exit status:
140

Command output:
(empty)

Work dir:
/scratchdata1/users/a1177955/HTT_sea_snake/aipysurus_laevis/HTT_CARP/KaKs_selection/Hero-1_new/work/8b/e2a5557dd2ca9e0fc45d643c16fe2f

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line

-- Check '.nextflow.log' file for details

I could not figure out the issue.

Cheers,
Mani
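
A note on reading the exit status: by the usual shell convention, a status above 128 means the process was terminated by signal number (status - 128). Under that convention, 140 corresponds to signal 12, which is SIGUSR2 on Linux. Such a status is commonly reported when a scheduler stops a job that has exceeded a resource limit such as wall time. Whether that is what happened on this cluster is an assumption that the thread does not confirm. The helper below is a small sketch of the arithmetic; the function name `decode_exit_status` is hypothetical.

```shell
#!/usr/bin/env bash
# Decode an exit status using the common shell convention that values
# above 128 mean "terminated by signal (status - 128)".
decode_exit_status() {
    local status=$1
    if [ "$status" -gt 128 ]; then
        echo "signal $((status - 128))"
    else
        echo "exit code $status"
    fi
}

decode_exit_status 140   # prints: signal 12
decode_exit_status 25    # prints: exit code 25
```

Because the RepeatMasker command redirects its output to batch-1.rmlog, that file and the .command.log file in the reported work dir are reasonable places to look for the last message written before the task stopped.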
