Unitigs missing details on the lane they are found in #27

Open
rgladstone opened this issue Dec 18, 2023 · 6 comments

Comments

@rgladstone

rgladstone commented Dec 18, 2023

I ran

unitig-caller --call --refs assembly_paths.txt --out ZA_unitigs_v3 --write-graph --threads 32

This was with version 1.3.0. assembly_paths.txt listed 1944 assembly paths, and the output contained 903111 unitigs.

Two unitigs had no lane details:

CCACCTTCCTCCGGTTTGTCACCGGCAGTCAACTTAGAGTGCCCAACTTAATGATGGCAACTAAGCTTAAGGGTTGCGCTCGTTGCGGGACTTAACC |
TTGTTCATAGTTCCATTATAGCAAAAAAAGGGCTCTATAATATTTGTAGTG | 15841_5_29:1

and

TAAAGAAGTCTCCGAAATTCCGCACTGAGCATCTTCTCCGAAAAAGGCCGCTAATGTGGCCTTTTTCTTTACCTGTGGTTCTCCGCCAAAATCCCAGCAAATTGCATCACCAAAGCTAAAAGCTTTCAGGGTTGTCTAAAAAGCACAAGACATAAGAGGAAGTGCGGTATTTTATAATCAAGCCCCCAAGAATTTACCATAACATCCGTTGCCCGCACCGCCTGAGACGCGTTCAGCGCGTTCCTGACGAAACCATGACAAAAACCACAACAAACCACCCCGGAACCCGTCAGAAACGCGCCTGTTAAATTTTAACGGCATGCATGACTATGCACCAGAATGACGCCATGCTCTTTTCACGCAAAAATCATCACCAGACGGGGAAAATCACCAGTGACCAGACAGGAATCCGCCGCCCTCAATATGGCCAAATTTATCCGCGCACAGACACTTCTCCTCCTTGAGCGGCTCGAGCAGATGGATCTGGATGAGGCTGCCGGCTGCTGTGAGCACCTGCACGATCAGGCCGAAGCGCTTTACGCCATGCTGAACGCACAGATAGGCGAGGAAAATGCGTGAAAATCGGTGAACGGGTGCGCAATTCAGTGCGCGGCCGTGAGGCGATGGCGGGGTGTCGGGGCGCAGCCCTGACCAGGGTATTTGTGATGCCGGCGCGTGCGCGGTATTACAAATGCACATCCTGTCCCGGAACGGACACCGGGAAACAGCAAAAAAAACCGGGCGGCACGCCCGGAACTCAATCAAGTTAGATTAGATTACTCTCACTCGTCCATAACAGCATCATGGAACGACGACCACCGTCCGTGACGGCCGCCTCGTTTAAGTATGGACAGAAATACAGAAAATGCTCAGGACGAAATGTAATGAATGCGAACGGATTCAAGAAATTCGAGCATGACAGTCCTTACGGCCGGTTCGGTTTCAGACAAAATCTGCCGGTATGCATCCAGCATCATGGCTCCGGCATCCCCTCCGGCACGCCGTAGCCAGACCGAAACAACGGACACAAGCAGGTGTCGCTCATCATCACTAAGAGTCATCAGGGCTCCGGAAGAAAAACCAAAC |
GAGCACTTTTAATTTGGTGACTTGAGTTATGAGCCAGAATATTTGTTTGACTTGAACTT | 15841_2_75:1

I just wanted to understand why this might be, in case lane IDs are missing from some of the other unitigs seen in more than one isolate.
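For anyone wanting to check their own output for the same pattern, here is a minimal sketch that lists every unitig whose lane field is empty. It assumes the `sequence | lane:1 lane:1 ...` layout shown in the lines above; the function name and the file name in the usage note are illustrative, not part of unitig-caller.

```python
def unitigs_without_lanes(lines):
    """Yield unitig sequences whose lane field (the text after '|') is empty.

    Assumes one unitig per line in the form 'SEQUENCE | lane:1 lane:1 ...'.
    A line with no '|' at all is also reported as having no lane details.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        sequence, _, lanes = line.partition("|")
        if not lanes.strip():
            yield sequence.strip()


example = [
    "ACGT | 15841_5_29:1 15841_2_75:1",
    "GGCC |",
    "TTAA | 15841_5_29:1",
]
print(list(unitigs_without_lanes(example)))
```

To run it on real output, pass an open file handle instead of the example list, e.g. `with open("ZA_unitigs_v3.pyseer") as handle:` (the `.pyseer` file name is an assumption about what `--out ZA_unitigs_v3` produces).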

@samhorsfield96
Collaborator

Hi, would you be able to send across the assembly_paths.txt file, please?

@rgladstone
Author

assembly_paths.txt
Sure thing, attached now.

@samhorsfield96
Collaborator

Hi Rebecca, if you could, would you be able to check the following, please:

  • Are these unitigs present in the .gfa file generated by unitig-caller?
  • Are these unitigs present in any of the source genomes (forward or reverse strand)?

As bifrost can merge k-mers across contig breaks, there is the possibility these unitigs are not present in any source genome.
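The two checks above could be sketched roughly as follows. This is an illustration rather than anything unitig-caller provides: all function names are hypothetical, the assemblies are assumed to be uncompressed FASTA files, and the .gfa file is assumed to follow GFA1, where segment lines are tab-separated and start with `S`, with the sequence in the third field.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def read_fasta_sequences(path):
    """Yield each contig of an uncompressed FASTA file as one upper-case string."""
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if chunks:
                    yield "".join(chunks)
                chunks = []
            elif line:
                chunks.append(line.upper())
    if chunks:
        yield "".join(chunks)


def gfa_segment_sequences(lines):
    """Yield the sequence field of every GFA1 segment ('S') line."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            yield fields[2].upper()


def unitig_in_sequences(unitig, sequences):
    """True if the unitig, or its reverse complement, lies within one sequence."""
    forward = unitig.upper()
    reverse = reverse_complement(forward)
    return any(forward in seq or reverse in seq for seq in sequences)
```

Each contig is searched separately, so a unitig that only exists because k-mers were joined across a contig break will be reported as absent from the assemblies while still appearing among the .gfa segments.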

@rgladstone
Author

Yes, they are in the .gfa file and not in the assemblies, so that makes sense. Thanks for clarifying.

@samhorsfield96
Collaborator

Hi Rebecca, it seems the issue might be with Bifrost, where some k-mers are not annotated with colours (pmelsted/bifrost#73). I would suggest trying an earlier version of Bifrost (e.g. v1.2.1) and seeing whether it gives the same error.

@rgladstone
Author

Sorry for the delay! I have tried a conda env with unitig-caller 1.3.0 and bifrost 1.2.1, and that gave the same results. I noticed there was a fix for #29 so I tried replacing the bifrost.py code within unitig-caller 1.3.0 (bifrost 1.3.1) with the updated code and I still get the same results.
