only None copy numbers in the output file #64

Open
maryamghr opened this issue Aug 29, 2023 · 5 comments

Comments

@maryamghr

I ran adVNTR on PacBio data, and the output BED file contains only None values in the R1 and R2 columns. In effect, it does not genotype any tandem repeat region.

@sara-javadzadeh
Collaborator

Hi Maryam,

Thanks for bringing this up. This is not the expected behavior of the adVNTR genotype command on PacBio data, so I suspect an error occurred. Could you please share the following?

  1. What is the exact advntr command you used (including all the flags)?
  2. Do you see any error messages in the .log file? If so, could you please share them?
  3. Could you please share a bit about the input data? Is it mapped reads? What is the fold coverage? Did you confirm that the dataset includes spanning reads for the targeted VNTR loci?

Sara
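
One way to check the third point (whether the dataset contains spanning reads) is to count alignments that cover the whole VNTR plus some flanking sequence. The sketch below is not part of adVNTR: `count_spanning`, `spanning_reads_in_bam`, and the 50 bp flank are hypothetical names and values chosen for illustration, and the BAM helper assumes `pysam` is installed and the BAM file is indexed.

```python
def count_spanning(alignments, vntr_start, vntr_end, flank=50):
    """Count alignments that cover the VNTR plus `flank` bases on each side.

    alignments: iterable of (ref_start, ref_end) pairs, 0-based, half-open.
    The flank size is an illustrative assumption, not an adVNTR setting.
    """
    lo = max(0, vntr_start - flank)
    hi = vntr_end + flank
    return sum(1 for start, end in alignments if start <= lo and end >= hi)


def spanning_reads_in_bam(bam_path, contig, vntr_start, vntr_end, flank=50):
    """Same count, taken from an indexed BAM file (requires pysam)."""
    import pysam  # imported here so count_spanning works without pysam

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        spans = [
            (read.reference_start, read.reference_end)
            for read in bam.fetch(contig, vntr_start, vntr_end)
            if not read.is_unmapped and read.reference_end is not None
        ]
    return count_spanning(spans, vntr_start, vntr_end, flank)
```

A count of zero at a locus with good coverage would suggest that reads overlap the region without extending past both flanks, or that the coordinates and contig names do not match the reference used for alignment.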

@maryamghr
Author

Hi Sara,

Thanks for following up.

  1. This is the advntr command that I ran:
advntr genotype --alignment_file $my_pacbio_bam_file --working_directory log_dir \
        --pacbio -m vntr_data/hg38_selected_VNTRs_Illumina.db \
        --outfmt bed -t 30
  2. There is no error message, but the main lines in the log file look like this for each VNTR region:
INFO:extract_unmapped_reads_to_fasta_file executed in 0.000037s
INFO:get_filtered_read_ids executed in 0.747395s
INFO:get_vntr_filtered_reads_map executed in 0.747683s
DEBUG:finding repeat count from pacbio alignment file for 201
INFO:length_distribution of unmapped spanning reads: []
DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read
...
DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read
INFO:length_distribution of mapped spanning reads: []
INFO:get_spanning_reads_of_aligned_pacbio_reads executed in 0.171241s
INFO:There is no spanning read
INFO:find_repeat_count_from_pacbio_alignment_file executed in 0.208687s
  3. I am using the HG002 BAM file (PacBio CLR reads aligned with minimap2) from GIAB. It is a whole-genome alignment file, and it does have coverage in the VNTR regions. For example, the coverage in the first VNTR region is ~40x.

Bests,
Maryam
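
To see how many loci are affected, the log excerpt above can be summarized per VNTR ID. This is a minimal sketch that assumes the log lines look exactly like the ones quoted in this comment; `loci_without_spanning_reads` is a hypothetical helper, not an adVNTR function.

```python
import re

# Patterns taken from the log lines quoted in this thread.
START = re.compile(r"finding repeat count from pacbio alignment file for (\S+)")
NO_SPAN = "There is no spanning read"


def loci_without_spanning_reads(log_lines):
    """Map each VNTR ID seen in the log to True if no spanning read was reported."""
    status = {}
    current = None
    for line in log_lines:
        match = START.search(line)
        if match:
            # A new locus begins; assume it is fine until the log says otherwise.
            current = match.group(1)
            status[current] = False
        elif NO_SPAN in line and current is not None:
            status[current] = True
    return status


example = [
    "DEBUG:finding repeat count from pacbio alignment file for 201",
    "INFO:length_distribution of mapped spanning reads: []",
    "INFO:There is no spanning read",
]
print(loci_without_spanning_reads(example))
```

If every ID maps to True, the problem is global (for example, a mismatch between the alignment file and the VNTR database) rather than specific to a few loci.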

@sara-javadzadeh
Collaborator

sara-javadzadeh commented Sep 3, 2023 via email

@maryamghr
Author

Salam Sara,

Thank you for working on it!
In the meantime, I would like to ask another question. Is there any way to use advntr to genotype VNTRs from a custom set of regions supplied as a BED file (with an optional column specifying the motif)?
So far, I have only seen the option for adding a custom VNTR region, which requires the genomic coordinates of a single region. I would like to run it on a BED file containing my regions of interest, if possible.

Bests,
Maryam
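
Whether adVNTR accepts such a file directly is not answered in this thread. As a rough sketch of the input described above, the following reads a BED file with an optional motif column into one record per region, which could then be fed to a per-region step. `Region` and `read_bed_regions` are hypothetical names, and placing the motif in the fourth column is an assumption.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, as in BED
    end: int    # exclusive, as in BED
    motif: Optional[str]


def read_bed_regions(lines: Iterable[str]) -> Iterator[Region]:
    """Yield one Region per BED line; the 4th column (if present) is the motif."""
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"BED line needs at least 3 columns: {line!r}")
        motif = fields[3] if len(fields) > 3 and fields[3] else None
        yield Region(fields[0], int(fields[1]), int(fields[2]), motif)


bed_lines = ["chr21\t100\t200\tACGT", "# comment", "chr1\t5\t50"]
for region in read_bed_regions(bed_lines):
    print(region)
```

Note that BED coordinates are 0-based and half-open, so they may need converting before being passed to a tool that expects 1-based positions.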

@Xiaodiao1111

Hello, have you solved this problem? If so, how did you solve it? I am running into a similar issue.
