Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "PrettyPrinterPerformance Optimized the PrettyPrinter for #894" #906

Merged
merged 1 commit into from
Dec 20, 2024

Conversation

bnbarham
Copy link
Contributor

Reverts #901 which now computes the column on codepoints rather than characters.

@bnbarham
Copy link
Contributor Author

bnbarham commented Dec 20, 2024

CC @macshome (sorry, we'll get there!)

I haven't benchmarked this at all, but we could do something like:

    guard let lastNewlineIndex = text.lastIndex(of: "\n") else {
      // In case there's no newline, the string itself is the only line
      column += text.count
      return
    }

    let lastLine = text[lastNewlineIndex...]
    column += lastLine.count

    // Now count the rest of the lines in a possible multi-line string. We are only interested in '\n' so we can use
    // the UTF8 view and skip grapheme clustering entirely.
    for element in text[...lastNewlineIndex].utf8 {
      if element == UInt8(ascii: "\n") {
        lineNumber += 1
      }
    }

Which should hopefully result in a single iteration still (though I'm unsure of the slices). But basically - iterate backwards to find the last newline, grab column count during/from there, then iterate forwards up to that index to get the other newlines (which can still use the utf8 view).

Couple extra tests we could add:

  func testCharacterVsCodepoint() {
    let input =
      """
      let fo = 1  // 🤥
      
      """

    assertPrettyPrintEqual(
      input: input,
      expected: input,
      linelength: 16,
      whitespaceOnly: true,
      findings: []
    )
  }

  func testCharacterVsCodepointMultiline() {
    let input =
      #"""
      /// This is a multiline
      /// comment that is in 🤥
      /// fact perfectly sized
      
      """#

    assertPrettyPrintEqual(
      input: input,
      expected: input,
      linelength: 25,
      whitespaceOnly: true,
      findings: []
    )
  }

@bnbarham bnbarham enabled auto-merge December 20, 2024 20:40
@bnbarham bnbarham force-pushed the revert-901-PrettyPrinterPerformance branch from 978e881 to 8073ddc Compare December 20, 2024 23:01
@bnbarham bnbarham merged commit 2b7e6d8 into main Dec 20, 2024
19 checks passed
@bnbarham bnbarham deleted the revert-901-PrettyPrinterPerformance branch December 20, 2024 23:14
@ahoppen
Copy link
Member

ahoppen commented Jan 8, 2025

@macshome Would you be interested in trying something like what @bnbarham suggested to count characters instead of codepoints?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants