Not retrieving full transcripts from GFF3, only coding segments (this is without the -C parameter) #140

vaneet-lotay · 2024-11-06T22:21:37Z

Hello,

I'm not sure whether I'm missing something obvious or using the wrong parameters for gffread, but when I run it to extract transcript sequences from a GFF3 file, it doesn't seem to extract the full transcript from the start coordinate to the end coordinate. I examined a few sequences from the output and, as best I can tell, it extracts only the CDS segments and not the introns in between. That may give me the coding sequences, but it's not what I want. Here's the command I use (example filenames):

gffread -w transcripts.fa -g genomic_seq.fa gene_models.gff3

In the transcripts.fa file, I noticed that the sequences are not the complete transcripts including introns. Is there a particular set of parameters that will give me that output? For example, here's the mRNA/transcript line from the GFF3:

Chr1 Xenbase mRNA 329594 331864 . - . ID=mRNA099831;Name=XM_031895097.1;Dbxref=GeneID:116408318,Genbank:XM_031895097.1;Parent=XBXT10g022928;gbkey=mRNA;gene=bbc3;transcript_id=RefSeq:XM_031895097.1;curie=RefSeq:XM_031895097.1;Ontology_term=SO:0000234

I expected the sequence length to be approximately the distance between that start and end coordinate, unless I'm interpreting this wrongly?
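To make that expectation concrete, here is a minimal Python sketch that computes the genomic span of the mRNA line above. The attribute column is abbreviated, and the helper name `genomic_span` is just for illustration. It assumes GFF3's 1-based, inclusive coordinates.

```python
# Abbreviated copy of the mRNA line quoted above (attributes shortened).
# Columns are split on whitespace because the tabs were lost when pasting.
GFF_LINE = (
    "Chr1 Xenbase mRNA 329594 331864 . - . "
    "ID=mRNA099831;Name=XM_031895097.1;Parent=XBXT10g022928"
)


def genomic_span(gff_line):
    """Return the inclusive length of a GFF3 feature (1-based start and end)."""
    fields = gff_line.split(None, 8)  # at most 9 columns
    start, end = int(fields[3]), int(fields[4])
    return end - start + 1


print(genomic_span(GFF_LINE))  # 2271
```

So an unspliced sequence for this record would be 2271 bases long, whereas a spliced sequence would be shorter by the total intron length.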

Most of the transcripts in my GFF3 have both CDS and exon features whose coordinates coincide, except at the start and end of each transcript, where the exon-only portions act as 'implied UTRs'.
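To illustrate the difference between the two outputs, here is a rough Python sketch with made-up toy data (the sequence, the exon coordinates, and the helper names are all hypothetical, not taken from my files or from gffread). It contrasts an unspliced extraction, which takes the whole start-to-end span including introns, with a spliced one, which concatenates the exon segments only.

```python
# Toy data only: this sequence and these exon coordinates are invented.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def unspliced(genome, start, end, strand):
    """Full genomic span (introns included), 1-based inclusive coordinates."""
    seq = genome[start - 1:end]
    return revcomp(seq) if strand == "-" else seq


def spliced(genome, exons, strand):
    """Concatenated exon segments only (introns removed)."""
    seq = "".join(genome[s - 1:e] for s, e in sorted(exons))
    return revcomp(seq) if strand == "-" else seq


genome = "AAACCCGGGTTTAAACCCGGG"   # 21 bp toy chromosome
exons = [(1, 6), (13, 18)]         # two exons with a 6 bp intron between them

print(len(unspliced(genome, 1, 18, "-")))  # 18
print(len(spliced(genome, exons, "-")))    # 12
```

The first length is what I was expecting to get; the second is closer to what I seem to be getting.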

Any help you can provide would be appreciated, thanks!

Vaneet
