Prepared statement does not support unicode values #1813

Closed
kaciakmaciak opened this issue Aug 13, 2024 · 4 comments


@kaciakmaciak

What happens?

A prepared INSERT statement does not correctly handle values containing Unicode characters written with a \u escape. Everything after such a character is cut off: in the example below, 'goo\u0000se' is stored as 'goo'.

To Reproduce

const conn = await db.connect();
await conn.query(`
  CREATE TABLE "Test" (value VARCHAR)
`);
const stmt = await conn.prepare(`
  INSERT INTO "Test" (value)
  VALUES (?)
`);
await stmt.query('🦆🦆🦆🦆🦆');
await stmt.query('goo␀se');
await stmt.query('goo\u0000se');
const result = await conn.query(`
  SELECT * FROM "Test"
`);
console.log(result.toArray().map((item) => item.toJSON()));  // Returns '🦆🦆🦆🦆🦆', 'goo␀se', 'goo'

Please see an example in Codesandbox.

Browser/Environment:

Any

Device:

Any

DuckDB-Wasm Version:

1.28.1-dev248.0

DuckDB-Wasm Deployment:

N/A

Full Name:

Katarina Anton

Affiliation:

individual

@kaciakmaciak changed the title from "Prepared statement does not support unicode" to "Prepared statement does not support unicode values" on Aug 13, 2024
@carlopi
Collaborator

carlopi commented Aug 21, 2024

Thanks for raising this. The problem here is connected to the use of JSON.stringify when going from a JS object to a representation closer to C++, which is basically char a[].

In particular, we use JSON.stringify, which is not information-preserving.

A solution here might involve re-implementing JSON.stringify or providing a custom replacer to it. I'm not really looking forward to that, but I don't see many great ways out.
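For readers following along, here is a minimal sketch of one way a string containing U+0000 can end up truncated once it is serialized to JSON and then read as a NUL-terminated char array. It is an illustration under assumptions, not DuckDB-Wasm's actual code path: `readCString` is a hypothetical helper that only mimics C-string semantics.

```javascript
// Illustration only: this is NOT the code DuckDB-Wasm runs internally.
const value = 'goo\u0000se';

// JSON.stringify writes the NUL character as the six-character escape \u0000.
const json = JSON.stringify([value]);
console.log(json); // ["goo\u0000se"]

// Decoding the JSON restores the real NUL character.
const decoded = JSON.parse(json)[0];
const bytes = new TextEncoder().encode(decoded);

// Hypothetical helper: read bytes the way a NUL-terminated C string is read,
// i.e. stop at the first 0 byte.
function readCString(buf) {
  const end = buf.indexOf(0);
  return new TextDecoder().decode(end === -1 ? buf : buf.subarray(0, end));
}

console.log(readCString(bytes)); // goo
```

Strings without an embedded NUL, such as the duck emoji value, pass through this kind of read unchanged, which matches the behaviour reported above.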

@carlopi
Collaborator

carlopi commented Aug 22, 2024

I fixed this in #1823.

There is still an open problem: going through the native JS API means strings are treated as JavaScript Strings (and not as Uint8Array, which closely matches C++ char* semantics). As a result, some normalisations are performed; for example, goo\u0000se will be normalised to goo\x00se.

I am not 100% happy with the fix, but it's hard to say what the intended semantics (and a solid implementation) should be. This is a more general problem, shared with the Python and Node.js bindings.

Escaping the backslash with another \ will keep such sequences as intended.

Note that when going through SQL (e.g. the duckdb-wasm shell), this works as intended.
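A small sketch of the plain JavaScript string semantics behind this remark may help. It only shows how the JavaScript parser treats the literals before any value reaches the bindings; it makes no claim about what DuckDB-Wasm stores afterwards.

```javascript
// The parser turns the \u0000 escape into a single NUL code unit.
const withNul = 'goo\u0000se';
// \x00 is just another spelling of the same character.
const hexForm = 'goo\x00se';
// A doubled backslash keeps the sequence as literal text.
const escaped = 'goo\\u0000se';

console.log(withNul === hexForm); // true
console.log(withNul.length);      // 6
console.log(escaped.length);      // 11
console.log(escaped);             // goo\u0000se
```

So by the time a string reaches `stmt.query`, the unescaped form already contains a real NUL character, whereas the double-backslash form contains the six visible characters of the escape sequence.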

I am inclined to consider this as solved.

@carlopi
Collaborator

carlopi commented Aug 22, 2024

I will close the issue for now, but feel free to comment / reopen if this is not satisfactory.

@carlopi carlopi closed this as completed Aug 22, 2024
@kaciakmaciak
Author

Thanks heaps for a quick fix 👍
