Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deparser bug: Incorrect string escaping #124

Open
rwe opened this issue Jul 7, 2024 · 0 comments
Open

deparser bug: Incorrect string escaping #124

rwe opened this issue Jul 7, 2024 · 0 comments

Comments

@rwe
Copy link

rwe commented Jul 7, 2024

Currently, deparse represents most string literals escaped only by single-quote doubling.
If special characters like backslashes occur, they're output literally, possibly resulting in invalid data. (How and whether those backslashes are interpreted is configuration-dependent).

Comment statements (COMMENT ON TABLE "foo" IS …) are, for some reason, the only string literals that are rendered with an attempt to use C-style "escape" strings (E'foo'), which consistently interpret backslash-escaping.

However, the rendering logic for those is completely wrong. It doesn't do the requisite quote doubling (or, alternatively, quote-escaping), and instead of doubling backslashes, it replaces them with the same number of them.

Effectively comments are rendered worse than unescaped, since they both still open for single-quote mangling, and additionally opened up for backslash interpolation.

That means that a validly written comment like:

COMMENT ON COLUMN "foo"."whatever" IS $$
Something blah, this data may have chars like '\n' and '\r' in it.
$$;

Gets rewritten as:

COMMENT ON COLUMN "foo"."whatever" IS E'
Something blah, this column may have chars like '\n' and '\r' in it.
';

which is completely invalid syntax since the single quotes close on the single quote before the [\, n] chars.

If you change that text to '\n' to "\n" (e.g. to pretend that the single quotes had been correctly escaped), then it outputs as:

COMMENT ON COLUMN "foo"."whatever" IS E'
Something blah, this column may have chars like "\n" and "\r" in it.
';

This parses, but not as the original comment. When re-parsed, becomes:

COMMENT ON COLUMN "foo"."whatever" IS E'
Something blah, this column may have chars like "
" and "
" in it.
';

Instead, Deparser.escape should be written something like:

class Deparser {
  // ...
  escape(str: string): string {
    // If the string contains backslashes, render an escape string using the E'…' syntax.
    // Single quotes and backslashes are escaped with a backslash.
    if (str.includes('\\')) {
      return `E'${str.replaceAll(/[\\']/g, '\\$&')}'`;
    }
    // No backslashes: double any internal single quotes.
    return `'${str.replaceAll(/'/g, '$&$&')}'`;
  }

  escapeNull(str: string | null | undefined): string {
    return str == null ? 'NULL' : this.escape(str);
  }
}

and the relevant lines in CommentStmt (

const escapeComment = (str) => {
return str.replace(/\\/g, '\\');
};
if (node.comment) {
if (/[^a-zA-Z0-9]/.test(node.comment)) {
// special chars we care about...
output.push(`E'${escapeComment(node.comment)}'`);
} else {
// find a double \\n or \\ something...
output.push(`'${node.comment}'`);
}
} else {
output.push('NULL');
}
) should be:

-const escapeComment = (str) => {
-  return str.replace(/\\/g, '\\');
-};
-
-if (node.comment) {
-  if (/[^a-zA-Z0-9]/.test(node.comment)) {
-    // special chars we care about...
-    output.push(`E'${escapeComment(node.comment)}'`);
-  } else {
-    // find a double \\n or \\ something...
-    output.push(`'${node.comment}'`);
-  }
-} else {
-  output.push('NULL');
-}
+output.push(this.escapeNull(node.comment));
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant