Not retrieving full transcripts from GFF3, only coding segments (this is without the -C parameter) #140

Open
vaneet-lotay opened this issue Nov 6, 2024 · 0 comments

Hello,

I'm not sure if I'm missing something obvious or using the wrong parameters for gffread, but when I run it to extract transcript sequences from a GFF3 file, it doesn't seem to extract the full transcript from the start to the end coordinate. I examined a few sequences from the output and, as best I can tell, it is extracting only the CDS segments and not the introns in between (so effectively the coding sequences), which is not what I want. Here's the command I use (example filenames):

gffread -w transcripts.fa -g genomic_seq.fa gene_models.gff3

In the transcripts.fa file I noticed that the sequences are not the complete transcripts including introns. Is there a particular set of parameters that will give me that output? For example, here's the mRNA/transcript line from the GFF3:

Chr1 Xenbase mRNA 329594 331864 . - . ID=mRNA099831;Name=XM_031895097.1;Dbxref=GeneID:116408318,Genbank:XM_031895097.1;Parent=XBXT10g022928;gbkey=mRNA;gene=bbc3;transcript_id=RefSeq:XM_031895097.1;curie=RefSeq:XM_031895097.1;Ontology_term=SO:0000234

I expected the sequence length to be approximately the distance between those start and end coordinates (331864 - 329594 + 1 = 2271 bp), unless I'm interpreting this wrongly?
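To make that expectation concrete, here is a minimal Python sketch (not part of gffread) that reports two lengths per transcript from GFF3 lines: the genomic span of the mRNA feature (introns included) and the summed length of its exon features (introns excluded). The function names and the two exon lines in the example are hypothetical and only there for illustration; the mRNA coordinates are the ones quoted above, with the attribute column abbreviated. Comparing the length of a record in transcripts.fa against these two numbers should show whether the output is the unspliced span, the spliced exons, or something shorter such as the CDS alone.

```python
from collections import defaultdict


def parse_attributes(field):
    """Turn a GFF3 attribute column (key=value;key=value) into a dict."""
    return dict(pair.split("=", 1) for pair in field.strip().split(";") if "=" in pair)


def transcript_lengths(gff3_lines):
    """Return {transcript_id: (genomic_span, summed_exon_length)}."""
    span = {}
    spliced = defaultdict(int)
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF3 feature line
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        attrs = parse_attributes(cols[8])
        length = end - start + 1  # GFF3 coordinates are 1-based and inclusive
        if ftype in ("mRNA", "transcript"):
            span[attrs["ID"]] = length
        elif ftype == "exon":
            # An exon may list several parents, separated by commas
            for parent in attrs.get("Parent", "").split(","):
                spliced[parent] += length
    return {tid: (span[tid], spliced.get(tid, 0)) for tid in span}


# mRNA coordinates are from the line quoted above (attributes abbreviated).
# The two exon lines are HYPOTHETICAL and only illustrate the calculation.
example = [
    "\t".join(["Chr1", "Xenbase", "mRNA", "329594", "331864", ".", "-", ".",
               "ID=mRNA099831;Parent=XBXT10g022928"]),
    "\t".join(["Chr1", "Xenbase", "exon", "329594", "330100", ".", "-", ".",
               "Parent=mRNA099831"]),
    "\t".join(["Chr1", "Xenbase", "exon", "331500", "331864", ".", "-", ".",
               "Parent=mRNA099831"]),
]

lengths = transcript_lengths(example)
for tid, (genomic_span, exon_sum) in lengths.items():
    print(f"{tid}: genomic span = {genomic_span} bp, summed exons = {exon_sum} bp")
```

To run it on a real file, pass an open file handle instead of the example list, for instance `transcript_lengths(open("gene_models.gff3"))`. The sketch assumes exon features point at the transcript through `Parent` and that the mRNA `ID` matches that value.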

Most of the transcripts in my GFF3 have both CDS and exon features whose coordinates are identical except at the start and end of each transcript, where the exon portions not covered by CDS act as 'implied UTRs'.

Any help you can provide would be appreciated, thanks!

Vaneet
