Handling Escape characters #1842
Replies: 3 comments 8 replies
-
Hey @esatterwhite, you might want to give a bit more context on the problem. Handling escape characters without clear delimiters isn't easy and is quite context dependent. Since they're escaped, I assume these escaped color tokens should be lexed as something else? Let's say you have a color token and a string token, where the string token also accepts escaped colors. That would look something like this:
const color = createToken({
  name: "color",
  pattern: /#[0-9a-fA-F]{6}/ // six hex digits
});
const text = createToken({
  name: "string",
  pattern: /"[^"]*"|'[^']*'|\\#[0-9a-fA-F]{6}/ // A normal string or an escaped color
});
Is this what you're looking for?
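To see how these two patterns behave side by side, here is a minimal sketch in plain JavaScript. It does not use Chevrotain; the `rules` table and the `tokenize` helper are hypothetical stand-ins that try each pattern in order at the current offset (via the sticky `y` flag), which is assumed to approximate a first-match lexer. The `ws` rule is an extra assumption so the sample input can contain spaces.

```javascript
// Hypothetical stand-in for a first-match lexer: each rule is tried in order
// at the current offset; the sticky flag forces the match to start there.
const rules = [
  { name: "color", pattern: /#[0-9a-fA-F]{6}/y },
  { name: "string", pattern: /"[^"]*"|'[^']*'|\\#[0-9a-fA-F]{6}/y },
  { name: "ws", pattern: /\s+/y, skip: true }, // assumption: whitespace is skipped
];

function tokenize(input) {
  const tokens = [];
  let offset = 0;
  while (offset < input.length) {
    let matched = false;
    for (const rule of rules) {
      rule.pattern.lastIndex = offset;
      const m = rule.pattern.exec(input);
      if (m !== null) {
        if (!rule.skip) tokens.push({ type: rule.name, image: m[0] });
        offset += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) throw new Error(`Unexpected character at offset ${offset}`);
  }
  return tokens;
}

// A bare color, an escaped color, and a quoted color.
const result = tokenize(String.raw`#CCCCCC \#CCCCCC "#CCCCCC"`);
console.log(result.map((t) => `${t.type}: ${t.image}`));
```

In this sketch the bare `#CCCCCC` comes out as a `color` token, while both the escaped and the quoted forms come out as `string` tokens. The image of the escaped token still includes the leading backslash, so it would need to be stripped afterwards if only the color text is wanted.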
-
I also (had) a question along these lines. My use case is that I would like to parse a query syntax that may contain quoted strings. Whenever you accept arbitrary strings, you also need a way to escape quotes within them. Multi-mode lexing seemed like a great fit for this: at a position where a string is allowed, a literal quote pushes a quoted-string mode, and the closing quote pops it. I initially implemented it like this:
const QDouble = createToken({ name: "QDouble", pattern: /"/, label: '"', push_mode: 'double_quoted' });
const NotQDouble = createToken({ name: "NotQDouble", pattern: /[^"]+/, label: "str" });
const QDoubleEnd = createToken({ name: "QDoubleEnd", pattern: /"/, label: '"', pop_mode: true });
const ESC = createToken({ name: "ESC", pattern: /\\./, label: '\\' });
const CalculatorLexer = new Lexer({
  modes: {
    "query": allTokens,
    "single_quoted": [ESC, NotQSingle, QSingleEnd],
    "double_quoted": [ESC, NotQDouble, QDoubleEnd],
  },
  defaultMode: "query"
});
This wasn't working as expected, even with "ESC" listed first in each mode. The problem is that "NotQDouble" accepts backslashes, so the "ESC" token never gets a chance unless the backslash is the first character of the string, no matter what order I tell the lexer to process the tokens in. The solution is simple:
const NotQDouble = createToken({ name: "NotQDouble", pattern: /[^"\\]+/, label: "str" });
(Note the addition of \\ to the negated character class.)
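The difference between the two "NotQDouble" patterns can be reproduced without Chevrotain. The sketch below is an illustration under stated assumptions: `lexQuotedBody` is a hypothetical helper that tries "ESC", "NotQDouble" and "QDoubleEnd" in that order with sticky regexes, roughly as the "double_quoted" mode would, and stops at the first closing quote (standing in for `pop_mode`).

```javascript
// Hypothetical helper: lex the text that follows an opening double quote,
// trying the rules in the same order as the "double_quoted" mode.
function lexQuotedBody(body, notQuote) {
  const rules = [
    { name: "ESC", pattern: /\\./y },
    { name: "NotQDouble", pattern: new RegExp(notQuote.source, "y") },
    { name: "QDoubleEnd", pattern: /"/y },
  ];
  const names = [];
  let offset = 0;
  while (offset < body.length) {
    let image = null;
    for (const rule of rules) {
      rule.pattern.lastIndex = offset;
      const m = rule.pattern.exec(body);
      if (m !== null) {
        names.push(rule.name);
        image = m[0];
        break;
      }
    }
    if (image === null) throw new Error(`Unexpected character at offset ${offset}`);
    offset += image.length;
    // The closing quote ends the string (stand-in for pop_mode).
    if (names[names.length - 1] === "QDoubleEnd") break;
  }
  return { names, rest: body.slice(offset) };
}

// Body of the string "say \"hi\"" after the opening quote has been consumed.
const body = String.raw`say \"hi\""`;
const broken = lexQuotedBody(body, /[^"]+/);   // swallows the backslash
const fixed = lexQuotedBody(body, /[^"\\]+/);  // stops before the backslash
console.log(broken.names, JSON.stringify(broken.rest));
console.log(fixed.names, JSON.stringify(fixed.rest));
```

With the original pattern, "NotQDouble" consumes `say \` including the backslash, so the escaped quote is treated as the end of the string and the rest of the input is left over. With the corrected pattern, each `\"` is matched by "ESC" and the string ends at the real closing quote.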
-
I'm looking for some suggestions around the best way to handle an escape character. I understand there are parser modes, but that requires a start and stop character. Which would effectively be like quoting - which is what we do now.
An example: # is a part of my syntax, but there is a use case where CSS hex codes such as #CCCCCC show up, which we don't want parsed. Right now we tell people to quote them, as in "#CCCCCC", but it would be nice to have an explicit escape character to bail out of the parsing, as in \#CCCCCC.
JavaScript does something similar inside quoted strings, for example '\"literal quoted', and the escape doesn't necessarily have to have a matching close character.