In your paper you report that RepeatModeler2 has a low number of false positives.
I am wondering, however, whether short repeats are more likely to be false positives than longer ones.
In my analysis, I obtained 464 repeats, of which about 10% are below 100 bp and almost 50% are below 500 bp (min = 56 bp, max = 17331 bp, average = 1131 bp).
Would you recommend filtering the identified repeat sequences by a minimum length?
Sorry for the long delay. It is hard to say from length alone; it really depends on the organism, the classes of TEs present, and so on. In many cases, shorter sequences may simply be fragments of true but much longer families. When curating a de novo generated library, we typically take the longer sequences first. After curating them (i.e., extending them), we compare the smaller fragments against the curated library to see whether we can discard duplicated results or identify subfamilies. The remaining sequences are then extended (if possible), and a final library is generated.
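
As a rough illustration of the "longer sequences first" ordering described above, here is a minimal Python sketch that reads a FASTA library, summarizes the length distribution, and splits it into a long-first set for curation and a short set to compare against the curated library later. The function names, the `min_len` cutoff, and the toy records are all assumptions made for illustration; this is not part of RepeatModeler, and the comparison and extension steps themselves are not shown.

```python
from statistics import mean


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def length_summary(records):
    """Return count, min, max, and mean sequence length."""
    lengths = [len(seq) for _, seq in records]
    return {
        "n": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
    }


def split_by_length(records, min_len=500):
    """Sort longest first, then split at min_len (a hypothetical cutoff).

    Nothing is discarded: the short set is kept so it can be compared
    against the curated long set afterwards.
    """
    ordered = sorted(records, key=lambda rec: len(rec[1]), reverse=True)
    long_first = [rec for rec in ordered if len(rec[1]) >= min_len]
    short = [rec for rec in ordered if len(rec[1]) < min_len]
    return long_first, short


# Toy records with made-up names; a real library would be read from a file,
# e.g. by passing an open file handle to read_fasta().
demo = """>fam1#Unknown
ACGTACGTAC
>fam2#LTR
ACGT
>fam3#DNA
ACGTACGT
""".splitlines()

records = list(read_fasta(demo))
summary = length_summary(records)
long_first, short = split_by_length(records, min_len=8)
```

The split keeps the short sequences rather than filtering them out, since, as noted above, they may be fragments of real families and length alone does not settle the question.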