re-scaffolding using agp file #100

Open
amara86 opened this issue Dec 2, 2024 · 1 comment
Comments

amara86 commented Dec 2, 2024

Hi,
I am scaffolding my genome assembly (estimated genome size: 516 Mb) with YaHS. After several rounds of scaffolding, I observed the following in the log:
[I::inter_link_norms] using new radius 1 (22) as noise ratio exceeds threshold 0.100
[I::run_yahs] scaffolding round 10 done
[I::print_asm_stats] assembly stats:
[I::print_asm_stats] N50: 17227518 (n = 10)
[I::print_asm_stats] N90: 8338583 (n = 24)
[I::run_yahs] assembly N50 (17227518) too small. End of scaffolding.
[I::run_yahs] consider running with increased memory limit if there was a memory issue.
[I::main] writing FASTA file for scaffolds
To continue improving the assembly, I used the last AGP file from a previous round (e.g., yahs.out_r05.agp) as the starting point and re-ran YaHS for additional rounds of scaffolding. This iterative approach improves the N50 and other metrics, but each round appears to start again from the smallest resolution. My question is: does this workflow of repeated re-scaffolding from the AGP file make sense? I still see a lot of noise in my data when I visualise the Hi-C file in Juicebox.
Regards,
Amara

Owner

c-zhou commented Dec 9, 2024

Hi @amara86,

We have observed improvements in some cases when scaffolding for multiple rounds, primarily because additional assembly errors are corrected in subsequent rounds. However, we have not performed systematic experiments to evaluate this thoroughly. Whether it is worthwhile for your genome may depend on the quality of both the assembly and the Hi-C data. You can check the karyotype of your genome to assess whether it has benefited from the additional scaffolding.

Best,
Chenxi
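
Editor's note: one way to track whether repeated re-scaffolding changes the reported metrics is to pull the N50/N90 lines out of each run's log and compare them side by side. The sketch below is not part of YaHS. The function name `parse_yahs_stats` is hypothetical, and it assumes the log lines look exactly like the excerpt quoted in the question, with the stats printed after each "scaffolding round N done" line. A higher N50 alone does not show that the joins are correct, so this complements the karyotype check suggested above rather than replacing it.

```python
import re

# Assumed line formats, taken from the log excerpt quoted in this issue.
ROUND_RE = re.compile(r"\[I::run_yahs\] scaffolding round (\d+) done")
STAT_RE = re.compile(r"\[I::print_asm_stats\] (N\d+): (\d+) \(n = (\d+)\)")

# Excerpt copied from the issue, used here only as sample input.
EXAMPLE_LOG = """\
[I::run_yahs] scaffolding round 10 done
[I::print_asm_stats] assembly stats:
[I::print_asm_stats] N50: 17227518 (n = 10)
[I::print_asm_stats] N90: 8338583 (n = 24)
"""


def parse_yahs_stats(log_text):
    """Collect (length, count) pairs for each Nxx stat, grouped by round.

    Stats that appear before any "scaffolding round N done" line are ignored.
    """
    rounds = []
    current = None
    for line in log_text.splitlines():
        round_match = ROUND_RE.search(line)
        if round_match:
            # Start a new record for this scaffolding round.
            current = {"round": int(round_match.group(1))}
            rounds.append(current)
            continue
        stat_match = STAT_RE.search(line)
        if stat_match and current is not None:
            name, length, count = stat_match.groups()
            current[name] = (int(length), int(count))
    return rounds


if __name__ == "__main__":
    for record in parse_yahs_stats(EXAMPLE_LOG):
        print(record)
```

Running the parser over the logs of successive re-scaffolding runs gives one record per round, which makes it easier to see whether the metrics are still moving or have plateaued.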
