[Bug]: JSONL codegen doesn't include framepath #34157

Open
avdempsey opened this issue Dec 29, 2024 · 3 comments
Labels
open-to-a-pull-request The feature request looks good, we are open to reviewing a PR

Comments

@avdempsey

Version

1.49.1

Steps to reproduce

  1. Create a page that contains an iframe
  2. Add an h1 tag with the same text to both the container page and the iframe page
  3. Start codegen and click the h1 tag inside the iframe
  4. Compare the generated JSONL against the generated NodeJS
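
For anyone trying to reproduce this, the page from steps 1 and 2 can be approximated with a single HTML file. The following is a minimal sketch, not the reporter's actual page: the helper name `buildReproPage` is hypothetical, and it assumes an inline `srcdoc` iframe behaves the same as a separately served iframe page for this purpose.

```typescript
// Hypothetical helper: builds one HTML document with an <h1> in the container
// page and an identical <h1> inside an iframe (inlined through srcdoc).
function buildReproPage(headingText: string): string {
  const framePage = `<h1>${headingText}</h1>`;
  // Escape the characters that would terminate or corrupt the attribute value.
  const escaped = framePage.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
  return [
    '<!DOCTYPE html>',
    '<html>',
    '  <body>',
    `    <h1>${headingText}</h1>`,
    `    <iframe srcdoc="${escaped}"></iframe>`,
    '  </body>',
    '</html>',
  ].join('\n');
}

// "Frame" matches the heading name seen in the recorded output.
const reproHtml = buildReproPage('Frame');
console.log(reproHtml);
```

Saving the printed HTML to a file and opening it with codegen should give two headings with the same accessible name, one per frame.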

NodeJS:
await page.locator('iframe').contentFrame().getByRole('heading', { name: 'Frame' }).click();

JSONL:

{"name":"click","selector":"internal:role=heading[name=\"Frame\"i]","signals":[],"button":"left","modifiers":0,"clickCount":1,"pageAlias":"page","locator":{"kind":"role","body":"heading","options":{"attrs":[],"exact":false,"name":"Frame"}}}

Expected behavior

I expected to see some indication of the framepath in the JSONL. Since it's missing, the JSONL of the recorded action is ambiguous.
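
To illustrate the ambiguity, here is a rough sketch of how a consumer might look for frame information in a recorded action. The `framePath` field is hypothetical: it is not part of the 1.49.1 output, and its name and shape (an array of iframe selectors) are assumptions made for illustration only.

```typescript
// Fields read by this sketch. All except `framePath` appear in the recorded line.
interface RecordedAction {
  name: string;
  selector: string;
  pageAlias: string;
  framePath?: string[]; // hypothetical field, not emitted by codegen 1.49.1
}

// The JSONL line exactly as recorded in this issue.
const recordedLine = String.raw`{"name":"click","selector":"internal:role=heading[name=\"Frame\"i]","signals":[],"button":"left","modifiers":0,"clickCount":1,"pageAlias":"page","locator":{"kind":"role","body":"heading","options":{"attrs":[],"exact":false,"name":"Frame"}}}`;

const recorded: RecordedAction = JSON.parse(recordedLine);

// Reports which frame an action targets, as far as the data allows.
function targetFrame(action: RecordedAction): string {
  if (action.framePath === undefined) return 'unknown (main frame or any iframe)';
  if (action.framePath.length === 0) return 'main frame';
  return action.framePath.join(' -> ');
}

console.log(targetFrame(recorded));
// prints "unknown (main frame or any iframe)"
console.log(targetFrame({ ...recorded, framePath: ['iframe'] }));
// prints "iframe"
```

With the recorded line as it stands, a click on the container's h1 and a click on the iframe's h1 cannot be told apart by a consumer.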

Actual behavior

NodeJS:
await page.locator('iframe').contentFrame().getByRole('heading', { name: 'Frame' }).click();

JSONL:

{"name":"click","selector":"internal:role=heading[name=\"Frame\"i]","signals":[],"button":"left","modifiers":0,"clickCount":1,"pageAlias":"page","locator":{"kind":"role","body":"heading","options":{"attrs":[],"exact":false,"name":"Frame"}}}

Additional context

I was hoping to use JSONL output of codegen as a safe serialization format for basic Playwright interactions. I was working on writing a program that takes the JSONL and turns it into Python. That seems like it could still work for many cases, but looking at the code in playwright-core/src/server/codegen/jsonl.ts I see it's much simpler than the other LanguageGenerator instances, and maybe there are other edge cases out there.
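
A minimal sketch of the kind of JSONL-to-Python translation described above, written in TypeScript. It handles only `click` actions with a role locator, ignores `button`, `modifiers`, and `clickCount`, and is not how Playwright's own generators work. The `framePath` field is hypothetical (it is the piece this issue asks for), and the function names are illustrative.

```typescript
interface RoleLocator {
  kind: string;
  body: string;
  options: { exact: boolean; name?: string };
}

interface ClickAction {
  name: string;
  pageAlias: string;
  locator: RoleLocator;
  framePath?: string[]; // hypothetical field, not emitted by codegen 1.49.1
}

// JSON.stringify produces a double-quoted, escaped literal that Python also accepts.
function pyString(value: string): string {
  return JSON.stringify(value);
}

// Translates one JSONL line into a line of Python Playwright code.
function toPython(line: string): string {
  const action = JSON.parse(line) as ClickAction;
  if (action.name !== 'click' || action.locator?.kind !== 'role')
    throw new Error(`unsupported action: ${action.name}`);
  // The alias becomes a Python identifier, so accept only identifier characters.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(action.pageAlias))
    throw new Error('invalid pageAlias');

  let target = action.pageAlias;
  for (const frameSelector of action.framePath ?? [])
    target += `.locator(${pyString(frameSelector)}).content_frame`;

  const args = [pyString(action.locator.body)];
  if (action.locator.options.name !== undefined)
    args.push(`name=${pyString(action.locator.options.name)}`);
  if (action.locator.options.exact)
    args.push('exact=True');
  return `${target}.get_by_role(${args.join(', ')}).click()`;
}

// The JSONL line exactly as recorded in this issue.
const issueLine = String.raw`{"name":"click","selector":"internal:role=heading[name=\"Frame\"i]","signals":[],"button":"left","modifiers":0,"clickCount":1,"pageAlias":"page","locator":{"kind":"role","body":"heading","options":{"attrs":[],"exact":false,"name":"Frame"}}}`;
const withFrame = JSON.stringify({ ...JSON.parse(issueLine), framePath: ['iframe'] });

console.log(toPython(issueLine));
// prints: page.get_by_role("heading", name="Frame").click()
console.log(toPython(withFrame));
// prints: page.locator("iframe").content_frame.get_by_role("heading", name="Frame").click()
```

The first output targets the main frame because the recorded line carries no frame information. Only the second, with the assumed `framePath`, matches the NodeJS output.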

Still, this JSONL feature is pretty awesome (as is Playwright, thank you for all that you do). I think I can make a workaround elsewhere in my own workflow.

Environment

System:
    OS: macOS 15.1.1
    CPU: (11) arm64 Apple M3 Pro
    Memory: 117.70 MB / 36.00 GB
  Binaries:
    Node: 20.12.0 - ~/.nvm/versions/node/v20.12.0/bin/node
    Yarn: 1.22.22 - ~/.nvm/versions/node/v20.12.0/bin/yarn
    npm: 10.9.0 - ~/.nvm/versions/node/v20.12.0/bin/npm
  Languages:
    Bash: 5.2.37 - /opt/homebrew/bin/bash
@Skn0tt Skn0tt self-assigned this Dec 30, 2024
@Skn0tt Skn0tt added the v1.50 label Dec 30, 2024
@Skn0tt
Member

Skn0tt commented Dec 30, 2024

Hi Alex! I'll discuss this with the team. From a technical perspective, we could add a frame path to the JSON, but I'm unsure about the state of JSONL. You're one of the first people to report using it. Could you elaborate on your use case? Why is it useful for you to transform JSONL to Python, instead of using the Playwright-generated Python in the first place?

@Skn0tt Skn0tt removed the v1.50 label Dec 30, 2024
@avdempsey
Author

Hi Simon, for context I work at Internet Archive where we're using Playwright for some web preservation projects.

I'm trying to create a fast and secure workflow where a web archivist can record behaviors with Playwright codegen and the inspector. These behaviors would go into a database to assist current and future web crawls, and could potentially serve as tests that our replay of archived web pages continues to work in the future.

I'm interested in JSONL principally for security. I'm imagining behaviors being recorded every day, by non-software engineers, without code review. Our preservation systems are distributed and run on different platforms (Python, Java, JS, and Go). Putting the behaviors in a database allows crawls in progress to pick them up, and I'm scared of the idea of exec'ing Python code fetched from a database attached to a web interface.
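
As a rough illustration of the security argument, stored behaviors can be treated as data and checked against an allow-list before anything replays them, so nothing fetched from the database is ever executed as code. This is only a sketch: the function name, the `Validation` type, and the choice of checks are assumptions, and `click` is the only action name taken from this issue.

```typescript
type Validation =
  | { ok: true; action: Record<string, unknown> }
  | { ok: false; reason: string };

// Accepts a stored line only if it is a JSON object with an allowed action name
// and a string selector. The line is parsed as data and never evaluated.
function validateLine(line: string, allowed: ReadonlySet<string>): Validation {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return { ok: false, reason: 'not valid JSON' };
  }
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed))
    return { ok: false, reason: 'not a JSON object' };
  const action = parsed as Record<string, unknown>;
  if (typeof action.name !== 'string' || !allowed.has(action.name))
    return { ok: false, reason: 'action name not allowed' };
  if (typeof action.selector !== 'string')
    return { ok: false, reason: 'missing selector' };
  return { ok: true, action };
}

// Only "click" appears in this issue; a real allow-list would be longer.
const allowedActions = new Set(['click']);

console.log(validateLine('{"name":"click","selector":"h1","pageAlias":"page"}', allowedActions).ok);
// prints true
console.log(validateLine('import os; os.system("...")', allowedActions));
// prints { ok: false, reason: 'not valid JSON' }
```

Each platform (Python, Java, JS, Go) could apply the same checks with its own JSON parser and then map validated actions to its own Playwright bindings.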

Since we do have diverse languages and technologies in play, and we're trying to build a library to stand the test of time, it's also appealing to me to store these behaviors and tests in a language agnostic format.

This codegen workflow is experimental, but if it works it could help us bring many more librarians and archivists into high fidelity web preservation work.

Thanks again for your work, Playwright and the codegen+inspector tools are inspiring!

@dgozman dgozman added the v1.50 label Jan 2, 2025
@ruifigueira
Contributor

I also needed the framepath in my Playwright fork so that recorded scripts could be replayed. I ended up patching it and adding a framepath property:

https://github.com/ruifigueira/playwright-crx/blob/7c4216379ae25bf02078854f9d635d81bbf68b3d/playwright/packages/playwright-core/src/server/codegen/jsonl.ts#L32

Since then, because I'm adding a feature that allows editing the generated code directly in the codegen Chrome extension, I moved to a different approach and rolled back that change. Nevertheless, the JSONL format seems incomplete without framepath.

@dgozman dgozman added open-to-a-pull-request The feature request looks good, we are open to reviewing a PR and removed v1.50 labels Jan 9, 2025
4 participants