only None copy numbers in the output file #64

maryamghr · 2023-08-29T13:12:44Z

I tried adVNTR on PacBio data and the output bed file has only None values in R1 and R2 columns. It basically does not genotype any tandem repeat region.

sara-javadzadeh · 2023-08-30T18:12:05Z

Hi Maryam,

Thanks for bringing this up. This is not the expected behavior for AdVNTR genotype command on PacBio data. So I expect that there was an error. Could you please share the following?

What is the exact advntr command you used (including all the flags)?
Do you see any error messages in the .log file? If so, could you please share them?
Could you please share a bit about the input data? Is it mapped reads? What is the fold coverage? Did you confirm that the dataset includes spanning reads for the targeted VNTR loci?

Sara

maryamghr · 2023-08-31T09:30:47Z

Hi Sara,

Thanks for following up.

This is the advntr command that I run:

advntr genotype --alignment_file $my_pacbio_bam_file --working_directory log_dir 
        --pacbio -m vntr_data/hg38_selected_VNTRs_Illumina.db 
        --outfmt bed -t 30

There is no error message, but the main lines in the log file are like this for each vntr region:

INFO:extract_unmapped_reads_to_fasta_file executed in 0.000037s
INFO:get_filtered_read_ids executed in 0.747395s
INFO:get_vntr_filtered_reads_map executed in 0.747683s
DEBUG:finding repeat count from pacbio alignment file for 201
INFO:length_distribution of unmapped spanning reads: []
DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read
...
DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read
INFO:length_distribution of mapped spanning reads: []
INFO:get_spanning_reads_of_aligned_pacbio_reads executed in 0.171241s
INFO:There is no spanning read
INFO:find_repeat_count_from_pacbio_alignment_file executed in 0.208687s

I am using HG002 bam file (PacBio CLR reads aligned with minimap2) from giab. It is a whole-genome alignment file, and it has indeed coverage in VNTR regions. The coverage in the first VNTR region is ~40x for example.

Bests,
Maryam

sara-javadzadeh · 2023-09-03T17:39:11Z

Hi Maryam, Thanks for sharing the info. I'm working on it. Sara

…

On Thu, Aug 31, 2023 at 2:31 AM maryamghr ***@***.***> wrote: Hi Sara, Thanks for following up. 1. This is the advntr command that I run: advntr genotype --alignment_file $my_pacbio_bam_file --working_directory log_dir --pacbio -m vntr_data/hg38_selected_VNTRs_Illumina.db --outfmt bed -t 30 1. There is no error message, but the main lines in the log file are like this for each vntr region: INFO:extract_unmapped_reads_to_fasta_file executed in 0.000037s INFO:get_filtered_read_ids executed in 0.747395s INFO:get_vntr_filtered_reads_map executed in 0.747683s DEBUG:finding repeat count from pacbio alignment file for 201 INFO:length_distribution of unmapped spanning reads: [] DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read ... DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read INFO:length_distribution of mapped spanning reads: [] INFO:get_spanning_reads_of_aligned_pacbio_reads executed in 0.171241s INFO:There is no spanning read INFO:find_repeat_count_from_pacbio_alignment_file executed in 0.208687s 1. I am using HG002 bam file (PacBio CLR reads aligned with minimap2) from giab. It is a whole-genome alignment file, and it has indeed coverage in VNTR regions. The coverage in the first VNTR region is ~40x for example. Bests, Maryam — Reply to this email directly, view it on GitHub <https://urldefense.com/v3/__https://github.com/mehrdadbakhtiari/adVNTR/issues/64*issuecomment-1700691483__;Iw!!Mih3wA!GwYl_Njdo_6wGBk3fh63y-0WAQfrh4_LOHiK8OXssSPQSSJTRy1fplmquGBMky1C-U1JGCiXFgkrvVWYFUNnQL18bTXcAT9g$>, or unsubscribe <https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/AOGKYDXKQND3SUUJ7EMTWQ3XYBKVDANCNFSM6AAAAAA4C6AUQE__;!!Mih3wA!GwYl_Njdo_6wGBk3fh63y-0WAQfrh4_LOHiK8OXssSPQSSJTRy1fplmquGBMky1C-U1JGCiXFgkrvVWYFUNnQL18bUdHFvUF$> . You are receiving this because you commented.Message ID: ***@***.***>

maryamghr · 2023-09-04T14:02:27Z

Salam Sara,

Thank you for working on it!
I would like to ask another question in the meantime. Is there any way to use advntr to genotype vntrs given a custom set of regions that are given as a bed file (with an optional column specifying the motif)?
So far, I only saw the option for adding a custom vntr region, which needs genomic coordinates of a single region. I want to run it on a bed file including my regions of interest (if possible).

Bests,
Maryam

Xiaodiao1111 · 2023-09-25T00:49:08Z

Hi Sara,

Thanks for following up.

This is the advntr command that I run:
advntr genotype --alignment_file $my_pacbio_bam_file --working_directory log_dir 
        --pacbio -m vntr_data/hg38_selected_VNTRs_Illumina.db 
        --outfmt bed -t 30 
There is no error message, but the main lines in the log file are like this for each vntr region:
INFO:extract_unmapped_reads_to_fasta_file executed in 0.000037s
INFO:get_filtered_read_ids executed in 0.747395s
INFO:get_vntr_filtered_reads_map executed in 0.747683s
DEBUG:finding repeat count from pacbio alignment file for 201
INFO:length_distribution of unmapped spanning reads: []
DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read
...
DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read
INFO:length_distribution of mapped spanning reads: []
INFO:get_spanning_reads_of_aligned_pacbio_reads executed in 0.171241s
INFO:There is no spanning read
INFO:find_repeat_count_from_pacbio_alignment_file executed in 0.208687s
I am using HG002 bam file (PacBio CLR reads aligned with minimap2) from giab. It is a whole-genome alignment file, and it has indeed coverage in VNTR regions. The coverage in the first VNTR region is ~40x for example.

Bests, Maryam

Hello, have you solved your problem? How to solve it. I had a similar problem.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

only None copy numbers in the output file #64

only None copy numbers in the output file #64

maryamghr commented Aug 29, 2023

sara-javadzadeh commented Aug 30, 2023

maryamghr commented Aug 31, 2023

sara-javadzadeh commented Sep 3, 2023 via email

maryamghr commented Sep 4, 2023

Xiaodiao1111 commented Sep 25, 2023

only None copy numbers in the output file #64

only None copy numbers in the output file #64

Comments

maryamghr commented Aug 29, 2023

sara-javadzadeh commented Aug 30, 2023

maryamghr commented Aug 31, 2023

sara-javadzadeh commented Sep 3, 2023 via email

maryamghr commented Sep 4, 2023

Xiaodiao1111 commented Sep 25, 2023