Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: copy posthog js processing #4

Closed
wants to merge 21 commits into from

Conversation

pauldambra
Copy link
Member

@pauldambra pauldambra commented Mar 10, 2024

was originally stacked on top of #2

see PostHog/meta#182

copies the user agent parsing behavior from posthog-js unchanged and adds config to let us deploy the plugin allowing that processing (for events that aren't expecting the original behavior)

confirmed locally with v3 processing enabled

Screenshot 2024-03-13 at 00 34 00

@pauldambra pauldambra requested a review from a team March 10, 2024 19:02
@pauldambra pauldambra changed the title Feat/copy posthog js processing feat: copy posthog js processing Mar 10, 2024
jest.config.js Outdated Show resolved Hide resolved
dist/index.js Outdated Show resolved Hide resolved
src/v3/README.md Show resolved Hide resolved
@pauldambra pauldambra mentioned this pull request Mar 12, 2024
@pauldambra pauldambra changed the base branch from chore/always-include-latest-built-version to main March 13, 2024 00:36

const vendor = event.properties['$navigator_vendor']

event.properties['$device'] = detectDevice(userAgent)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we use $device_name and/or $device_name on Mobile, I wonder if we should unify that? otherwise the filtering experience isn't ideal unless the filtering considers all those variations?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for this I want to keep it the same as posthog-js so that we can validate the switchover

But then once we've got this in place it'll be "easy" to start adding other platforms/sdks

event.properties['$browser'] = detectBrowser(userAgent, vendor)
event.properties['$browser_version'] = detectBrowserVersion(userAgent, vendor)
const [osName, osVersion] = detectOS(userAgent)
event.properties['$os'] = osName
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as https://github.com/PostHog/user-agent-plugin/pull/4/files#r1522710965
We use $os_name on Mobile, and $os_version

@robbie-c
Copy link
Member

Apologies for the scope creep suggestions: it looks like there's a new spec for user agents https://developer.mozilla.org/en-US/docs/Web/API/Navigator/userAgentData. It's experimental, but we might want to start getting the sdk sending it, so that we can use it in the future without waiting for people to update their sdk.

Scope creep 2: bot detection. We also parse the user agent for bot detection - this is a long list of strings that we maintain. There are a few other sources of this data which are likely more complete than ours, e.g. https://github.com/monperrus/crawler-user-agents, https://github.com/omrilotan/isbot, https://user-agents.net/bots, but if we added all of these to posthog-js it would increase our bundle size by quite a bit. We could do filtering of the largest bots in posthog-js (to keep our costs low) but do a more thorough filtering server-side (to keep our customer's costs low and their data accurate).

@pauldambra
Copy link
Member Author

Apologies for the scope creep suggestions: it looks like there's a new spec for user agents developer.mozilla.org/en-US/docs/Web/API/Navigator/userAgentData. It's experimental, but we might want to start getting the sdk sending it, so that we can use it in the future without waiting for people to update their sdk.

totally... this area feels pretty unlooked after some times. so makes sense to grab data when it's available (even if we ignore it for now) - so will punt this to posthog-js follow up?

@pauldambra
Copy link
Member Author

Scope creep 2: bot detection. We also parse the user agent for bot detection - this is a long list of strings that we maintain. There are a few other sources of this data which are likely more complete than ours, e.g. monperrus/crawler-user-agents, omrilotan/isbot, user-agents.net/bots, but if we added all of these to posthog-js it would increase our bundle size by quite a bit. We could do filtering of the largest bots in posthog-js (to keep our costs low) but do a more thorough filtering server-side (to keep our customer's costs low and their data accurate).

Yeah, would be cool to keep some limited/obvious bot detection in posthog-js and then do more thorough detection in the plugin server (although we did have a customer request to be able to skip bot detection so we'd need to figure out how to honour that too)

@pauldambra
Copy link
Member Author

and I think the user agent detection change recently in PostHog/posthog-js#1359 so we'd need to update this

@pauldambra
Copy link
Member Author

this plugin won't. be the way forward here. see: https://posthog.slack.com/archives/C05LJK1N3CP/p1724753326948729

@pauldambra pauldambra closed this Aug 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants