Wrong parse [rt.cpan.org #55629] #12

oalders · 2020-08-24T18:36:46Z

Migrated from rt.cpan.org#55629 (status was 'open')

Requestors:

[email protected]

From [email protected] on 2010-03-16 15:09:51:

HTML:
<iframe/**/src="http://mail.ru"  name="poc iframe jacking"  width="100%"
height="100%" scrolling="auto" frameborder="no"></iframe>

$parser = HTML::Parser->new(
 api_version => 3,
 start_h => [ sub{
   my ($Self, $Text, $Tag, $Attr) = @_;
   print "Tag is: ".$Tag;
 }, "self, text, tagname, attr" ]
);
$parser->ignore_elements( qw( iframe ));
$parser->ignore_tags( qw( iframe ));

output:
Tag is: iframe/**/src="http://mail.ru"

From [email protected] on 2010-03-18 13:51:31:

Tue Mar 16 11:09:51 2010, NIKOLAS wrote:
> HTML:
> <iframe/**/src="http://mail.ru"  name="poc iframe jacking"  width="100%"
> height="100%" scrolling="auto" frameborder="no"></iframe>
> 
> $parser = HTML::Parser->new(
>  api_version => 3,
>  start_h => [ sub{
>    my ($Self, $Text, $Tag, $Attr) = @_;
>    print "Tag is: ".$Tag;
>  }, "self, text, tagname, attr" ]
> );
> $parser->ignore_elements( qw( iframe ));
> $parser->ignore_tags( qw( iframe ));
> 
> output:
> Tag is: iframe/**/src="http://mail.ru"

HTML: <script/src="ya.ru"> is parsed wrongly in the same way.

From [email protected] on 2010-04-04 20:38:08:

I don't understand what rules you propose that HTML::Parser should follow to parse this kind of 
bogus HTML.  You think it should treat "/**/" and "/" as whitespace?

From [email protected] on 2010-06-01 07:13:54:

Applying these three regular expressions to the input text corrects these
problems:
s{(/\*)}{ $1}g;
s{(\*/)}{$1 }g;
s{(<[^/\s<>]+)/}{$1 /}g;

You will probably find a more architecturally sound solution.
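
For anyone who wants to try the workaround before the markup reaches HTML::Parser, here is a minimal sketch that wraps the three substitutions above in a helper. The name normalize_tag_slashes is hypothetical and not part of HTML::Parser, and the sample strings are shortened versions of the markup from this ticket. It only shows what the substitutions do to the text; it makes no claim about how any particular HTML::Parser version then tokenizes the result.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not an HTML::Parser API):
# applies the three substitutions from this comment to a copy of the input.
sub normalize_tag_slashes {
    my ($html) = @_;
    $html =~ s{(/\*)}{ $1}g;            # put a space before every "/*"
    $html =~ s{(\*/)}{$1 }g;            # put a space after every "*/"
    $html =~ s{(<[^/\s<>]+)/}{$1 /}g;   # turn "<name/" into "<name /"
    return $html;
}

print normalize_tag_slashes('<iframe/**/src="http://mail.ru"></iframe>'), "\n";
# prints: <iframe /**/ src="http://mail.ru"></iframe>

print normalize_tag_slashes('<script/src="ya.ru">'), "\n";
# prints: <script /src="ya.ru">
```

Note that the substitutions are global, so they also touch any "/*" or "*/" that appears outside a tag, for example inside inline CSS or JavaScript comments. That may or may not be acceptable depending on what the text is used for afterwards.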
