YaraSerializer modifies condition section #17

SaretX opened this issue Mar 18, 2020 · 1 comment

Comments

@SaretX

SaretX commented Mar 18, 2020

While working with your library, I found that when you parse a ruleset from a string and later serialize one of the parsed rules back to a string, numbers in the condition section are converted to decimal form, e.g.:

rule := ruleset.Rules[0]
newRuleset := &pb.RuleSet{
	Rules: []*pb.Rule{rule.AsProto()},
}

buf := &bytes.Buffer{}
serializer := gyp.NewSerializer(buf)
serializer.Serialize(newRuleset)

As a result, for a condition like: uint16 ( 0 ) == 0x5a4d and filesize < 40KB I got uint16(0) == 23117 and filesize < 40960.

Would it be possible to add a flag that keeps the condition section as a string rather than as an Expression? Or is there another way to get a parsed rule back as a string?

Thanks!

@malvidin

The condition is almost certainly not going to be stored as a string, and related changes would also affect parsing of the meta and strings sections. The parser uses the expressions to ensure that the condition is valid.

YARA treats all numbers the same way regardless of how they are written, so this is also a valid representation of that condition:

uint16be(0000MB) == 19802 and filesize < 0o120000
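
To illustrate why the original spelling is lost, here is a minimal Go sketch of the idea. parseNumber is a hypothetical helper written for this comment, not gyp's lexer code; it assumes only the 0x, 0o, KB and MB forms that appear in this thread and ignores overflow. Every spelling collapses to a plain int64, so nothing remains to tell the serializer which form was used.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNumber is a hypothetical stand-in for the lexer's number handling.
// It is not gyp's implementation; it only mirrors the idea that every
// spelling of an integer ends up as a single int64 value.
func parseNumber(tok string) (int64, error) {
	switch {
	case strings.HasPrefix(tok, "0x"):
		return strconv.ParseInt(tok[2:], 16, 64)
	case strings.HasPrefix(tok, "0o"):
		return strconv.ParseInt(tok[2:], 8, 64)
	case strings.HasSuffix(tok, "KB"):
		n, err := strconv.ParseInt(strings.TrimSuffix(tok, "KB"), 10, 64)
		return n * 1024, err
	case strings.HasSuffix(tok, "MB"):
		n, err := strconv.ParseInt(strings.TrimSuffix(tok, "MB"), 10, 64)
		return n * 1024 * 1024, err
	default:
		return strconv.ParseInt(tok, 10, 64)
	}
}

func main() {
	// Different spellings taken from the conditions quoted in this thread.
	for _, tok := range []string{"0x5a4d", "23117", "40KB", "40960", "0o120000", "0000MB"} {
		n, err := parseNumber(tok)
		if err != nil {
			fmt.Println(tok, "error:", err)
			continue
		}
		fmt.Printf("%-9s -> %d\n", tok, n)
	}
}
```

In this sketch, 0x5a4d and 23117 produce the same value, as do 40KB, 40960 and 0o120000, which is why the serializer can only pick one form when writing the condition back.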

Tagging it as hexadecimal, octal, or decimal might be possible, if you want to try. It might require adding two additional literals, one for hex integers and one for octal integers.
I don't think a literal would be needed for KB/MB, if you assume that the KB or MB form is preferred for any decimal integer.
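
As a rough sketch of what that tagging could look like on the serializer side: the Base and IntLiteral types below are hypothetical and do not exist in gyp or its protobuf definitions. The sketch assumes non-negative values and applies the assumption above that any decimal integer that is a whole multiple of 1024 is written in KB or MB form.

```go
package main

import "fmt"

// Base records how an integer was written in the rule source.
// Hypothetical type for illustration; not part of gyp.
type Base int

const (
	Decimal Base = iota
	Hex
	Octal
)

// IntLiteral pairs a parsed value with the base it was written in.
// Hypothetical type for illustration; not part of gyp.
type IntLiteral struct {
	Value int64
	Base  Base
}

// String writes the literal back in its original base. Decimal values that
// are whole multiples of 1MB or 1KB are written with the size suffix, so no
// separate tag is needed for KB/MB.
func (l IntLiteral) String() string {
	switch l.Base {
	case Hex:
		return fmt.Sprintf("0x%x", l.Value)
	case Octal:
		return fmt.Sprintf("0o%o", l.Value)
	}
	const kb, mb = 1024, 1024 * 1024
	switch {
	case l.Value > 0 && l.Value%mb == 0:
		return fmt.Sprintf("%dMB", l.Value/mb)
	case l.Value > 0 && l.Value%kb == 0:
		return fmt.Sprintf("%dKB", l.Value/kb)
	}
	return fmt.Sprintf("%d", l.Value)
}

func main() {
	magic := IntLiteral{Value: 23117, Base: Hex}
	size := IntLiteral{Value: 40960, Base: Decimal}
	fmt.Printf("uint16(0) == %s and filesize < %s\n", magic, size)
}
```

One consequence of this choice is that a decimal value originally written as 40960 would also come back as 40KB, so the round trip is stable but not always byte-identical to the input.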

There are three parts of the lexer that convert the tokens to _NUMBER_, starting here.

{digit}+(MB|KB){0,1} {
