Skip to content

Commit

Permalink
http-equiv validators (#102)
Browse files Browse the repository at this point in the history
  • Loading branch information
rviscomi authored Apr 18, 2024
1 parent 269d631 commit 4b9e9fd
Show file tree
Hide file tree
Showing 8 changed files with 733 additions and 99 deletions.
5 changes: 3 additions & 2 deletions crx/capo.js

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion crx/manifest.json
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
"manifest_version": 3,
"name": "Capo: get your ﹤𝚑𝚎𝚊𝚍﹥ in order",
"description": "Visualize the optimal ordering of ﹤𝚑𝚎𝚊𝚍﹥ elements on any web page",
"version": "1.4.11",
"version": "1.5.0",
"permissions": [
"scripting",
"activeTab",
Expand Down
72 changes: 64 additions & 8 deletions docs/src/content/docs/user/validation.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,8 @@ title: Validation
description: Learn about how capo.js validates the head element
---

import { Image } from 'astro:assets';
import ValidationAreas from '../../../assets/validation-areas.png';

import { Image } from "astro:assets";
import ValidationAreas from "../../../assets/validation-areas.png";

The `<head>` element sets up all of the necessary metadata for a page to load properly and performantly. capo.js performs a number of validation checks on the `<head>` to ensure it meets modern best practices.

Expand All @@ -19,7 +18,11 @@ There are a few ways to see when an element is invalid:
- the element appears striped in the color bar
- the expanded console entry is annotated with an ❌ icon

<Image src={ValidationAreas} alt="Console logs showing all three ways elements are flagged as invalid." loading="eager" />
<Image
src={ValidationAreas}
alt="Console logs showing all three ways elements are flagged as invalid."
loading="eager"
/>

In the example above, you can see all three ways that an element can be flagged as invalid: top-level warning, striped color bar, and warning styles in its expanded entry.

Expand Down Expand Up @@ -62,7 +65,7 @@ If capo.js detects zero or more than one `<title>` element, it will log a valida

![Validation warning that "Expected exactly 1 title element, found 0"](../../../assets/validation-title.png)

In the example above, the `<title>` element is missing, so capo.js warns that "Expected exactly 1 `<title>` element, found 0".
In the example above, the `<title>` element is missing, so capo.js warns that "Expected exactly 1 `<title>` element, found 0".

## No more than one `<base>` element

Expand All @@ -74,9 +77,30 @@ If capo.js detects more than one `<base>` element, it will log a validation warn

In the example above, there is more than one `<base>` element, so capo.js warns that "Expected at most 1 `<base>` element, found 2".

## No `<meta>` CSP
## No invalid `<meta http-equiv>` elements

There are a handful of standardized pragma directives that can be used with the `http-equiv` attribute to change browser behavior. There are even some non-standard directives that browsers still choose support, the most notable being [`origin-trial`](#no-invalid-origin-trials). And there are actually a couple of standardized directives that are non-conforming, meaning that their use is totally discouraged, specifically: `content-language` and `set-cookie`.

:::caution
Contrary to popular belief, `http-equiv` meta tags _are not_ equivalent to HTTP headers. Learn more about why [you probably don't need `http-equiv` meta tags](https://rviscomi.dev/2023/07/you-probably-dont-need-http-equiv-meta-tags/).
:::

Even when using a standardized, supported, and conforming `http-equiv` directive, its `content` attribute value may still be invalid. For example, the `content-type` directive _must_ declare a character encoding of UTF-8. There are some additional restrictions on this directive, specifically the requirements that it be discovered within the first 1024 bytes and it doesn't coincide with the `<meta charset>` element.

Many of these cases are harmless no-ops, but capo.js will validate them and provide a helpful warning message with relevant metadata to help you understand whether it's safe to remove it or if anything needs to be changed for it to work properly.

According to the [W3C specification](https://w3c.github.io/webappsec-csp/#policy-delivery), a Content Security Policy (CSP) can be set as either an HTTP header or a `<meta http-equiv>` tag.
There are some cases where it's not so harmless:

- `set-cookie` doesn't actually set any cookies, which could lead to broken behavior.
- `content-type` character encoding can be declared too late, which may lead to performance issues.
- `content-security-policy` [breaks](https://issues.chromium.org/issues/40273969) Chrome's preload scanner, which may lead to performance issues.
- `content-security-policy` directives like `frame-ancestors` and `sandbox` are not supported in meta tags and may have been set mistakenly, breaking assumptions about the page's security.
- `description` and similar metadata may have been mistakenly assigned to the `http-equiv` meta attribute instead of the intended `name` attribute, which could break assumptions about the page's SEO.
- `refresh` can reload or redirect the page and is known to cause [accessibility issues](https://www.w3.org/TR/WCAG10-HTML-TECHS/#meta-element).

### No `<meta>` CSP

According to the [W3C specification](https://w3c.github.io/webappsec-csp/#policy-delivery), a Content Security Policy (CSP) can be set as either an HTTP header or a `<meta http-equiv>` tag.

Despite `<meta>` CSP declarations being technically valid, per the spec, browsers handle them differently. In particular, Chrome will disable the [preload scanner](https://web.dev/preload-scanner/) if it discovers a CSP declared after a `<script>` element. The preload scanner can improve performance by 20%, so this behavior has major implications on the user experience.

Expand All @@ -92,7 +116,7 @@ In the example above, there is a `<meta>` CSP element, so capo.js warns that "CS

This validation warning is an example of capo.js being more opinionated than simply following the specification. The warning includes a recommendation to use the CSP header instead, which avoids the preload scanner issue all together. Also note that the `Content-Security-Policy-Report-Only` directive is only valid as an HTTP header and not as a `<meta http-equiv>` element.

## No invalid origin trials
### No invalid origin trials

Sites can register for [origin trials](https://developer.chrome.com/en/docs/web-platform/origin-trials/) to enable individual experimental web platform features. To enable them on a given site, a token must be included as either an `Origin-Trial` HTTP header or `<meta http-equiv>` element.

Expand Down Expand Up @@ -123,3 +147,35 @@ In the example above, two separate embedded third party scripts injected origin
In the first warning, the token contains an invalid origin. The token metadata is missing the `isThirdParty` flag and the `origin` property is set `https://googlesyndication.com:443`, which is presumably the third party that injected the token. However, because the origin of the page is different from the one in the origin trial metadata, and it wasn't registered as a third party token, it's not valid. A similar warning would appear if the origin of the page is `https://www.example.com` but the origin in the metadata is `https://example.com:443` and it's missing the `isSubdomain` flag.

In the second warning, the origin is a valid third party, but the token is expired. In the token metadata, you can see that it expired in November 2022.

### No invalid `default-style` directives

The `default-style` directive indicates which [alternative stylesheet](https://developer.mozilla.org/en-US/docs/Web/CSS/Alternative_style_sheets) should be enabled. While it's completely standard and supported across browsers, [adoption is extremely low](https://rviscomi.dev/2023/07/you-probably-dont-need-http-equiv-meta-tags/#http-equiv-adoption) at only about 1k websites. Of those, about [31%](https://rviscomi.dev/2023/07/you-probably-dont-need-http-equiv-meta-tags/#default-style) are used incorrectly.

To be used correctly, the `content` attribute of the directive must be equal to the `title` value of an alternative stylesheet. capo.js will warn when the title cannot be found and the directive has no effect.

In addition to flagging incorrect usage, capo.js will also discourage using this directive entirely. Setting a preferred stylesheet results in a flash of unstyled content, which could be avoided by using default stylesheets with `@media` rules instead.

### No invalid character encoding

The `content-type` directive is an alias for the `charset` meta tag, which declares the document's content encoding. The [HTML spec](https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-http-equiv-content-type) places strict requirements on how and where this declaration can occur:

- The document cannot have both a `content-type` directive and a `charset` meta tag
- If one does exist, the character encoding declaration must be found within the firsts 1024 bytes of the document
- If one does exist, the character encoding must be set to UTF-8

It's valid for a document not to have either `content-type` nor `charset` meta tags, as long as the `Content-Type` HTTP header is set, also to UTF-8. capo.js does not presently validate HTTP headers, only the contents of the document `<head>`.

capo.js will validate that all three of the requirements above are met, and log a warning if not. If there are redundant character encoding declarations, capo.js will warn on the `content-type` element, giving preference to the `charset` meta tag. If the declaration occurs too late, capo.js will include the byte index for reference. And if the encoding is not set to UTF-8, capo.js will log the actual encoding used.

### No meta refresh

The `refresh` directive can force the page to reload or redirect after a specified amount of time. This is considered an [accessibility issue](https://www.w3.org/TR/WCAG10-CORE-TECHS/#auto-page-refresh) by the WCAG. It can be disorienting to users for the page to suddenly reload or redirect, and there are more semantic ways to indicate that a page's contents have moved.

capo.js issues a warning whenever a meta `refresh` directive is found. If it contains a redirect, capo.js encourages using HTTP 3xx redirects instead. Otherwise, it encourages including a mechanism for users to disable the auto-reloading behavior, if one is not already provided by browsers.

### Don't disable DNS prefetching

The [`x-dns-prefetch-control`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-DNS-Prefetch-Control) HTTP header and meta directive are non-standard, but widely supported. By default, most browsers will speculatively resolve the DNS records for URLs needed by the page, which is a generally accepted performance optimization.

In Chrome, any value other than `on` will disable the default DNS prefetching behavior. capo.js will always warn when this directive is found, and the message will be unique to the usage. In cases when DNS prefetching is explicitly enabled, capo.js clarifies that this has no effect as it's already the default behavior. In cases where it's set to `off`, capo.js emphasizes the performance benefits of DNS prefetching and clarifies that it should only be disabled when there are legitimate security concerns. For all other values besides `on` and `off`, capo.js warns about using non-standard values and reiterates the performance and security considerations.
Loading

0 comments on commit 4b9e9fd

Please sign in to comment.