Skip to content

Commit

Permalink
fix: Sanitize URL queries containing spaces correctly
Browse files Browse the repository at this point in the history
Spaces are encoded as `+` (or `%20`) in URLs. Unfortunately I decoded
this character as a simple `+` instead of a space. When I re-encoded the
query value, the `+` was then encoded as `%2B`, which was wrong as it
should have stayed as `+` (or `%20`).

Now, I we re-encode queries, the character `+` is correctly decoded to a
space, and then re-encoded to `+`.

Bug introduced in: c41afda
  • Loading branch information
marienfressinaud committed Feb 9, 2024
1 parent 244278d commit 49c324f
Show file tree
Hide file tree
Showing 2 changed files with 18 additions and 11 deletions.
23 changes: 15 additions & 8 deletions lib/SpiderBits/src/Url.php
Original file line number Diff line number Diff line change
Expand Up @@ -236,13 +236,20 @@ private static function normalizePath(string $path): string
/**
* Percent-decode a value.
*
* It applies rawurldecode() on the value as long as a percent-encoded
* character is detected.
* It applies rawurldecode() (or urldecode based on the raw parameter) on
* the value as long as a percent-encoded character is detected.
*/
private static function fullPercentDecode(string $value): string
private static function fullPercentDecode(string $value, bool $raw = true): string
{
while (preg_match('/%[0-9A-Fa-f]{2}/', $value) === 1) {
$value = rawurldecode($value);
while (
(preg_match('/%[0-9A-Fa-f]{2}/', $value) === 1) ||
(!$raw && str_contains($value, '+'))
) {
if ($raw) {
$value = rawurldecode($value);
} else {
$value = urldecode($value);
}
}

return $value;
Expand Down Expand Up @@ -280,17 +287,17 @@ private static function percentRecodeQuery(string $query): string
$decoded_parameters = [];
$parameters = self::parseQuery($query);
foreach ($parameters as $name => $value) {
$name = rawurlencode(self::fullPercentDecode($name));
$name = urlencode(self::fullPercentDecode($name, raw: false));
if (is_array($value)) {
$value = array_map(function ($partial_value) {
if ($partial_value === null) {
return null;
} else {
return rawurlencode(self::fullPercentDecode($partial_value));
return urlencode(self::fullPercentDecode($partial_value, raw: false));
}
}, $value);
} elseif ($value !== null) {
$value = rawurlencode(self::fullPercentDecode($value));
$value = urlencode(self::fullPercentDecode($value, raw: false));
}
$decoded_parameters[$name] = $value;
}
Expand Down
6 changes: 3 additions & 3 deletions tests/lib/SpiderBits/UrlTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -187,9 +187,9 @@ public static function sanitizeProvider(): array
['https://example.com/?foo=bar?&foo=baz?', 'https://example.com/?foo=bar%3F&foo=baz%3F'],
["https://domén-with-accent.com?query=with-àccent", 'https://xn--domn-with-accent-dqb.com/?query=with-%C3%A0ccent'], // phpcs:ignore Generic.Files.LineLength.TooLong
["https://host.com?query=with-%C3%A0ccent", 'https://host.com/?query=with-%C3%A0ccent'],
["https://host.com?utm_source=gazette%252B-%252Babonn%25C3%25A9s", 'https://host.com/?utm_source=gazette%2B-%2Babonn%C3%A9s'], // phpcs:ignore Generic.Files.LineLength.TooLong
["https://host.com?utm_source=gazette%2B-%2Babonn%25C3%25A9s", 'https://host.com/?utm_source=gazette%2B-%2Babonn%C3%A9s'], // phpcs:ignore Generic.Files.LineLength.TooLong
["https://host.com?utm_source=gazette+-+abonn%C3%A9s", 'https://host.com/?utm_source=gazette%2B-%2Babonn%C3%A9s'], // phpcs:ignore Generic.Files.LineLength.TooLong
["https://host.com?utm_source=gazette%252B-%252Babonn%25C3%25A9s", 'https://host.com/?utm_source=gazette+-+abonn%C3%A9s'], // phpcs:ignore Generic.Files.LineLength.TooLong
["https://host.com?utm_source=gazette%2B-%2Babonn%25C3%25A9s", 'https://host.com/?utm_source=gazette+-+abonn%C3%A9s'], // phpcs:ignore Generic.Files.LineLength.TooLong
["https://host.com?utm_source=gazette+-+abonn%C3%A9s", 'https://host.com/?utm_source=gazette+-+abonn%C3%A9s'], // phpcs:ignore Generic.Files.LineLength.TooLong
["http://evil.com/foo#bar/baz", 'http://evil.com/foo#bar/baz'],
["http://evil.com/foo#bar%22baz", 'http://evil.com/foo#bar%22baz'],
["http://evil.com/foo#🐘", 'http://evil.com/foo#%F0%9F%90%98'],
Expand Down

0 comments on commit 49c324f

Please sign in to comment.