empty tmpConsensi.fa #97

Open
MikiSchikora opened this issue Aug 6, 2020 · 3 comments


@MikiSchikora

Hello,

I am trying to run RepeatModeler 2.0.1 (installed with conda) for two different databases with the following command:

RepeatModeler -database <db_name> -pa 4 -LTRStruct -debug

where <db_name> is either chr6.fasta (a single chromosome) or complete_genome.fasta (the whole genome).

RepeatModeler works well for the whole genome, but fails for chr6.fasta and displays the following message:

RepeatClassifier Version 2.0.1
======================================
Search Engine = rmblast
  - Looking for Simple and Low Complexity sequences..
FastaDB::compact - Error could not locate file tmpConsensi.fa.masked!
 at /data/anaconda3_cluster/anaconda3/envs/perSVade_env/share/RepeatModeler/RepeatClassifier line 333.
Classification Time: 00:00:00 (hh:mm:ss) Elapsed Time

In fact, the files tmpConsensi.fa.masked and tmpConsensi.fa exist, but they are empty, which probably raises this error. In addition, the final chr6.fasta-families.fa is not created. However, the files consensi.fa and combined.fa exist and contain repeat families, suggesting that I should be getting a non-empty output.

I wonder whether 1) this is a bug in the program, or 2) I just have to run RepeatClassifier on the families found.

If 2), should I run RepeatClassifier on consensi.fa or combined.fa?

Would this be a solution to any situation where there are errors in RepeatClassifier due to empty 'tmpConsensi.fa'?

I also attach the full log for clarity:

chr6_repeatModeler_out.txt

Thanks for your time.

@jebrosen
Member

jebrosen commented Aug 7, 2020

> In fact, the files tmpConsensi.fa.masked and tmpConsensi.fa exist, but they are empty, which probably raises this error.

That does seem wrong; tmpConsensi.fa is supposed to simply be a copy of one of the consensus files. What happens if you re-run RepeatClassifier manually?

> However, the files consensi.fa and combined.fa exist and contain repeat families, suggesting that I should be getting a non-empty output.
> should I run RepeatClassifier on consensi.fa or combined.fa?

How many families are in each? Since there are only a few files, it should be easy to run RepeatClassifier on both and compare.
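One way to answer the "how many families" question is to count the FASTA header lines in each file. The sketch below is only an illustration: the helper names `count_fasta_records` and `summarize` are made up for this example, and it assumes the script is run from the directory that holds consensi.fa, combined.fa and tmpConsensi.fa.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def summarize(paths):
    """Map each path to its record count, or to None if the file does not exist."""
    return {
        str(p): (count_fasta_records(p) if Path(p).is_file() else None)
        for p in paths
    }


if __name__ == "__main__":
    # File names are the ones mentioned in this thread; adjust the paths as needed.
    for name, n in summarize(["consensi.fa", "combined.fa", "tmpConsensi.fa"]).items():
        print(name, "missing" if n is None else f"{n} families")
```

An empty file such as the tmpConsensi.fa described above would be reported with 0 families, which makes it easy to tell apart from a file that is missing entirely.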

@leon945945

leon945945 commented Oct 17, 2020

Hi, my RepeatModeler run was almost done, but the following error occurred:

something went wrong with the TRFMask program. The tmpConsensi.fa.masked file was missing or empty!

The consensi.fa.classified file was not generated. What is wrong with the TRFMask program, and how can I fix it? Thanks!

@jebrosen
Member

@leon945945 that error message is from an older version of RepeatModeler; in fact, that exact part of the code has been changed as part of some cleanup and fixing on RepeatClassifier last year. I would first try downloading the latest RepeatModeler in a new directory and running the new version of RepeatClassifier on your consensi.fa file.
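For anyone who wants to script the manual re-run suggested in this thread, here is a minimal sketch. It makes several assumptions that should be checked against your installed version: that `RepeatClassifier` is on `PATH`, that it takes its input through a `-consensi` option, and that it writes its result next to the input as `<input>.classified` (the consensi.fa.classified name mentioned above). The helper names `build_classifier_command` and `run_classifier` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_classifier_command(consensi_path):
    """Build the argument list for a manual RepeatClassifier run on one file.

    Assumption: the input is passed with -consensi; confirm with the
    help output of your RepeatClassifier version.
    """
    return ["RepeatClassifier", "-consensi", str(consensi_path)]


def run_classifier(consensi_path):
    """Run RepeatClassifier; return the classified file's path, or None if absent."""
    consensi = Path(consensi_path).resolve()
    # Refuse to start on a missing or empty input, the situation reported above.
    if not consensi.is_file() or consensi.stat().st_size == 0:
        raise ValueError(f"{consensi} is missing or empty; nothing to classify")
    if shutil.which("RepeatClassifier") is None:
        raise FileNotFoundError("RepeatClassifier is not on PATH")
    # Run inside the input's directory so temporary files stay together.
    subprocess.run(build_classifier_command(consensi), check=True, cwd=consensi.parent)
    # Assumption: the output is written next to the input as <name>.classified.
    classified = consensi.with_name(consensi.name + ".classified")
    return classified if classified.is_file() else None
```

Calling `run_classifier("consensi.fa")` and then `run_classifier("combined.fa")` from a copy of the working directory would give two results to compare, as suggested earlier in the thread.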
