mapFromAlignments() and soft-clipping #12
Hi Herve,
The issue is at GenomicAlignments/src/coordinate_mapping_methods.c Lines 169 to 223 in 4237abd
In particular GenomicAlignments/src/coordinate_mapping_methods.c Lines 198 to 201 in 4237abd
To demonstrate, if you remove the clipping from the alignment, everything looks fine:
If you increase the clipping size, even the last 2 positions are wrong:
Soft clipping on the other side is also an issue:
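For illustration only, here is a minimal Python sketch of how a CIGAR walk can account for soft clips when mapping a position within the read back to the reference. It is not the package's C code; the function name `query_to_ref` and its arguments are hypothetical, and it assumes 1-based positions with soft-clipped bases counted as part of the read.

```python
import re

# Hypothetical helper, not part of GenomicAlignments.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def query_to_ref(cigar, ref_start, qpos):
    """Map a 1-based read position (soft clips counted) to a 1-based
    reference position. Returns None when the base is not aligned
    (soft-clipped, inserted, or outside the read)."""
    q = 0          # read bases consumed so far
    r = ref_start  # reference position of the next aligned base
    for n, op in CIGAR_RE.findall(cigar):
        length = int(n)
        if op in "M=X":        # consumes both read and reference
            if q < qpos <= q + length:
                return r + (qpos - q - 1)
            q += length
            r += length
        elif op in "IS":       # consumes read only
            if q < qpos <= q + length:
                return None
            q += length
        elif op in "DN":       # consumes reference only
            r += length
        # H and P consume neither read nor reference
    return None

if __name__ == "__main__":
    for pos in range(1, 8):
        print(pos, query_to_ref("2S4M", 10, pos))
```

The point of the sketch is that the two soft-clipped bases advance the read counter but not the reference counter, so the first aligned base (read position 3) lands on the alignment start rather than two positions downstream.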
One way to fix this is to change this function (
Hi @gaoce,
Thanks for taking the time to look at the code. Would you be willing to submit a PR? The problem is that I'm not familiar with the implementation of
Best,
Hi @hpages,
Sure. I will find some time to work on it.
I've bumped into the same problem identified by @hpages. I'd like to point out some further subtleties in mapFrom and mapTo with soft/hard clips. While the genome coordinates are uniquely defined for an alignment, the coordinates within the alignment itself can be defined in several ways: within the mapped part of the read only, excluding clipped parts, or counting the soft- (or hard-) clipped parts as well. Note that both options can be useful, depending on the question asked. For example, for soft-clipped reads the whole sequence (and its qualities) is stored in the record, so indexing into it requires coordinates that include the clipped parts. For hard-clipped alignments, discarding the clipped parts is usually sensible, but for liftover-like tasks, where the alignment is not a read but an alternative, shifted genome, coordinates in the alignment that include the hard clips are required (this is the case with minimap2 assembly-to-assembly mapping). For example, for a hard-clipped alignment:
And for the soft-clipped case:
To add to the confusion, in the current implementation I think that a good option is to add an additional parameter
So, the suggested behaviour should be like this:
Suggested mapFromAlignments behaviour
In this case "mapped" is the default if called without the style parameter.
Suggested mapToAlignments behaviour
In this case "sequence" is the default if called without the style parameter.
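As a rough sketch of what such a style switch could mean in practice, the Python below maps a reference position into alignment-local coordinates under both conventions. It is an illustration under assumptions, not the proposed R interface: the function name `ref_to_query` is hypothetical, positions are 1-based, "sequence" is taken to count leading soft clips, "mapped" is taken to skip them, and hard clips are ignored in both styles.

```python
import re

# Hypothetical helper, not part of GenomicAlignments.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def ref_to_query(cigar, ref_start, rpos, style="sequence"):
    """Map a 1-based reference position to a 1-based alignment-local
    position. Returns None if rpos falls in a deletion/skip or outside
    the alignment."""
    if style not in ("sequence", "mapped"):
        raise ValueError("style must be 'sequence' or 'mapped'")
    q = 0          # local bases counted so far
    r = ref_start  # reference position of the next aligned base
    for n, op in CIGAR_RE.findall(cigar):
        length = int(n)
        if op in "M=X":        # consumes both read and reference
            if r <= rpos < r + length:
                return q + (rpos - r) + 1
            q += length
            r += length
        elif op == "I":        # inserted bases are part of the read
            q += length
        elif op == "S":        # counted only in "sequence" style
            if style == "sequence":
                q += length
        elif op in "DN":       # reference positions with no read base
            if r <= rpos < r + length:
                return None
            r += length
        # H and P are ignored in this sketch (an assumption)
    return None

if __name__ == "__main__":
    for style in ("sequence", "mapped"):
        print(style, [ref_to_query("2S4M", 10, p, style) for p in range(10, 15)])
```

With a `2S4M` alignment starting at reference position 10, the first aligned base is local position 3 under "sequence" and local position 1 under "mapped", which is exactly the two-base offset that distinguishes the two conventions.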
This looks wrong:
The reported deletion (ref pos 6 & 7) seems wrong. Should be ref pos 4 & 5.
sessionInfo():