I'm not exactly sure what is happening here, but it seems wrong and is easily reproducible. When the "same as reference" HGVS syntax is used, as shown below, you can supply a position that is greater than the length of the sequence and it will still return a result with a state of state: { type="LiteralSequenceExpression", value="" }. You can test it with the following expression...
NM_000412.5:c.1930=
The transcript NM_000412.5 is 1947 residues long, but its CDS region is 19..1596, so I'm assuming c.1929 is really reference position 1929+(19-1), or 1947, thus pointing to the last residue in the sequence. So, I would assume anything greater than c.1929 would fail. But as I write this out, I'm thinking that c.1929 should really fail too, since the coding sequence only runs from ref seq residue 19 through 1596. This means the c. positions would go from c.1 = 19 to c.1578 = 1596 (1596 - 19 + 1 = 1578). If this tracks, then I would assume any c. position greater than c.1578 should fail. Of course, someone could use the HGVS positional syntax to reference positions further into the 3' UTR, like c.*200, which would indicate 200 residues into the 3' UTR past the stop codon of the coding sequence.
In this transcript, the last valid position using c. nomenclature would be 1947 (ref seq length) - 1596 (last CDS position) = 351, or NM_000412.5:c.*351=.
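To make the arithmetic above concrete, here is a minimal sketch of mapping a simple c. position onto a 1-based transcript position and rejecting out-of-range values. The function name `c_to_transcript_pos` and its parameters are hypothetical and are not taken from this library. It assumes the CDS range 19..1596 is 1-based and inclusive, so the last coding position works out to c.1578 (1596 - 19 + 1), and it ignores intronic offsets such as c.100+5.

```python
def c_to_transcript_pos(c_pos: str, cds_start: int, cds_end: int, seq_len: int) -> int:
    """Map a simple c. position ("1578", "*351", "-18") to a 1-based transcript position.

    Hypothetical helper: cds_start and cds_end are 1-based, inclusive.
    Raises ValueError when the position falls outside the CDS or the transcript.
    """
    if c_pos.startswith("*"):
        # 3' UTR: counted from the first residue after the CDS
        offset = int(c_pos[1:])
        if offset < 1:
            raise ValueError(f"c.{c_pos}: 3' UTR offset must be >= 1")
        pos = cds_end + offset
    elif c_pos.startswith("-"):
        # 5' UTR: counted backwards from the first CDS residue (there is no c.0)
        offset = int(c_pos[1:])
        if offset < 1:
            raise ValueError(f"c.{c_pos}: 5' UTR offset must be >= 1")
        pos = cds_start - offset
    else:
        # Plain coding position: must fall inside the CDS itself
        n = int(c_pos)
        cds_len = cds_end - cds_start + 1
        if not 1 <= n <= cds_len:
            raise ValueError(f"c.{c_pos} is outside the CDS (c.1 to c.{cds_len})")
        pos = cds_start + n - 1

    if not 1 <= pos <= seq_len:
        raise ValueError(f"c.{c_pos} maps to {pos}, outside the transcript (1 to {seq_len})")
    return pos


# NM_000412.5 values quoted in this issue
SEQ_LEN, CDS_START, CDS_END = 1947, 19, 1596

print(c_to_transcript_pos("1", CDS_START, CDS_END, SEQ_LEN))     # 19
print(c_to_transcript_pos("1578", CDS_START, CDS_END, SEQ_LEN))  # 1596
print(c_to_transcript_pos("*351", CDS_START, CDS_END, SEQ_LEN))  # 1947
```

Under these assumptions, `c.1930` is rejected because it lies past the end of the CDS, and `c.*352` is rejected because it would map to position 1948, one past the end of the transcript.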
@larrybabb Good catch. I always forget about the CDS start site with coding DNA. When performing index checks, I forgot to add the CDS start site to the position for c. coordinate types.
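A rough illustration of the kind of index check described in this comment, not the project's actual code: `index_in_bounds` and `cds_start_offset` are hypothetical names, and the offset is assumed to be the number of residues preceding the CDS (18 for a CDS starting at 1-based position 19). Only positive c. positions are handled.

```python
def index_in_bounds(pos: int, seq_len: int, cds_start_offset: int = 0) -> bool:
    """Return True if a positive 1-based position lands inside the sequence.

    cds_start_offset (hypothetical) is the number of residues before the CDS.
    It is 0 for coordinate types that already index the full sequence and
    shifts a c. position onto the transcript otherwise.
    """
    return 1 <= pos + cds_start_offset <= seq_len


SEQ_LEN = 1947   # length of NM_000412.5, as quoted in the issue
CDS_START = 19   # 1-based first CDS residue, as quoted in the issue

# Without the CDS offset, c.1930 is compared directly against the sequence length.
print(index_in_bounds(1930, SEQ_LEN))                 # True
# With the offset applied, c.1930 maps to position 1948 and is rejected.
print(index_in_bounds(1930, SEQ_LEN, CDS_START - 1))  # False
# c.1929 maps to position 1947, the last residue, and still passes this check.
print(index_in_bounds(1929, SEQ_LEN, CDS_START - 1))  # True
```

This check only guards against running off the end of the sequence. Whether plain c. positions past the end of the CDS should also be rejected in favor of the c.* form is a separate validation question.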