bug: leading method comment is outside method body #272

Open
2 tasks done
adamcohen2 opened this issue Nov 20, 2024 · 0 comments

Did you check existing issues?

  • I have read all the tree-sitter docs if it relates to using the parser
  • I have searched the existing issues of tree-sitter-c

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

tree-sitter 0.24.4 (fc8c1863e2e5724a0c40bb6e6cfc8631bfe5908b)

Describe the bug

When calling NextSibling on a leading method comment, the entire method body is returned, instead of only the next statement.

For example, given the following Ruby code:

def foo
  # comment1
  puts "statement 1"
  puts "statement 2"
  # comment2
  puts "statement 3"
end

foo

If a node points to # comment1 and I call node.NextSibling(), I would expect the result to be puts "statement 1". Instead, it is the entire function body:

  puts "statement 1"
  puts "statement 2"
  # comment2
  puts "statement 3"

Is this a bug, or is it by design? tree-sitter-go doesn't behave this way: it returns the single-line statement following the comment node, as expected:

package main

import (
	"fmt"

	tree_sitter "github.com/tree-sitter/go-tree-sitter"
	tree_sitter_go "github.com/tree-sitter/tree-sitter-go/bindings/go"
	tree_sitter_ruby "github.com/tree-sitter/tree-sitter-ruby/bindings/go"
)

func main() {
	rubySource := []byte(`
def foo
  # comment1
  puts "statement 1"
  puts "statement 2"
  # comment2
  puts "statement 3"
end

foo
	`)

	goSource := []byte(`
package main

import "fmt"

func foo() {
	// comment1
	fmt.Println("statement 1")
	fmt.Println("statement 2")
	// comment2
	fmt.Println("statement 3")
}

func main() {
	foo()
}
`)

	rubyLang := tree_sitter.NewLanguage(tree_sitter_ruby.Language())
	OutputCommentAndStatement("comment1", "#", "ruby", rubySource, rubyLang)
	OutputCommentAndStatement("comment2", "#", "ruby", rubySource, rubyLang)

	goLang := tree_sitter.NewLanguage(tree_sitter_go.Language())
	OutputCommentAndStatement("comment1", "//", "golang", goSource, goLang)
	OutputCommentAndStatement("comment2", "//", "golang", goSource, goLang)
}

func OutputCommentAndStatement(comment, commentSymbol, langName string, sourceCode []byte, language *tree_sitter.Language) {
	parser := tree_sitter.NewParser()
	defer parser.Close()
	parser.SetLanguage(language)

	tree := parser.Parse(sourceCode, nil)
	defer tree.Close()

	rawQuery := fmt.Sprintf(`((comment) @comment (#match? @comment "^%s %s"))`, commentSymbol, comment)

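	// Fail loudly on a malformed query pattern instead of discarding the error.
	query, err := tree_sitter.NewQuery(language, rawQuery)
	if err != nil {
		panic(err)
	}
	defer query.Close()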

	qc := tree_sitter.NewQueryCursor()
	defer qc.Close()

	captures := qc.Captures(query, tree.RootNode(), sourceCode)

	for match, index := captures.Next(); match != nil; match, index = captures.Next() {
		node := match.Captures[index].Node

		sibling := node.NextSibling()
		if sibling == nil {
			continue // the comment is the last child of its parent
		}
		fmt.Printf("languge: %s\n", langName)
		fmt.Printf("sibling node contents for %s\n", comment)
		fmt.Println("\nBEGIN STATEMENT")
		fmt.Printf("%s\n", sibling.Utf8Text(sourceCode))
		fmt.Println("END STATEMENT\n")
	}
}

Output:

languge: ruby
sibling node contents for comment1

BEGIN STATEMENT
puts "statement 1"
  puts "statement 2"
  # comment2
  puts "statement 3"
END STATEMENT

languge: ruby
sibling node contents for comment2

BEGIN STATEMENT
puts "statement 3"
END STATEMENT

languge: golang
sibling node contents for comment1

BEGIN STATEMENT
fmt.Println("statement 1")
END STATEMENT

languge: golang
sibling node contents for comment2

BEGIN STATEMENT
fmt.Println("statement 3")
END STATEMENT

Steps To Reproduce/Bad Parse Tree

Input code:

def foo
  # comment1
  puts "statement 1"
  puts "statement 2"
  # comment2
  puts "statement 3"
end

Parse tree (the leading method comment is unexpectedly outside the method body):

(program [0, 0] - [6, 3]
  (method [0, 0] - [6, 3]
    name: (identifier [0, 4] - [0, 7])
    (comment [1, 2] - [1, 12])
    body: (body_statement [2, 2] - [5, 20]
      (call [2, 2] - [2, 20]
      <snip>
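
The tree shape above would account for the NextSibling result: comment1 is a direct child of method, so the node that follows it is body_statement, whose text spans every remaining statement. The following is a minimal sketch of that relationship. It uses a hypothetical `node` type rather than the go-tree-sitter API, and it omits the anonymous `def` and `end` tokens, so treat it as an illustration only.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a syntax-tree node; it is NOT the
// go-tree-sitter Node type.
type node struct {
	kind     string
	text     string
	parent   *node
	children []*node
}

// add appends children to n and sets their parent pointers.
func (n *node) add(children ...*node) *node {
	for _, c := range children {
		c.parent = n
		n.children = append(n.children, c)
	}
	return n
}

// nextSibling returns the node that follows n under the same parent, or nil.
func (n *node) nextSibling() *node {
	if n.parent == nil {
		return nil
	}
	for i, c := range n.parent.children {
		if c == n && i+1 < len(n.parent.children) {
			return n.parent.children[i+1]
		}
	}
	return nil
}

// buildReportedTree mirrors the reported tree shape: comment1 sits next to
// body_statement, directly under method. It returns the comment1 node.
func buildReportedTree() *node {
	comment1 := &node{kind: "comment", text: "# comment1"}
	body := (&node{kind: "body_statement"}).add(
		&node{kind: "call", text: `puts "statement 1"`},
		&node{kind: "call", text: `puts "statement 2"`},
		&node{kind: "comment", text: "# comment2"},
		&node{kind: "call", text: `puts "statement 3"`},
	)
	method := &node{kind: "method"}
	method.add(&node{kind: "identifier", text: "foo"}, comment1, body)
	return comment1
}

func main() {
	sib := buildReportedTree().nextSibling()
	// The sibling is the whole body wrapper, not the first call inside it.
	fmt.Println(sib.kind, len(sib.children))
}
```

In this model the sibling of comment1 is the body_statement wrapper with four children, which matches the multi-line text printed for the Ruby comment1 case.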

Expected Behavior/Parse Tree

Expected parse tree (the leading method comment should be inside the method body):

(program [0, 0] - [6, 3]
  (method [0, 0] - [6, 3]
    name: (identifier [0, 4] - [0, 7])
    body: (body_statement [2, 2] - [5, 20]
      (comment [1, 2] - [1, 12])
      (call [2, 2] - [2, 20]
      <snip>
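
Until the tree shape is settled, a caller could work around the difference by descending into the body wrapper when the sibling turns out to be one. The sketch below assumes a hypothetical `node` type and a hypothetical `nextStatement` helper; neither is part of go-tree-sitter, and the real binding would need the equivalent kind check and child lookup. It shows the helper returning the same statement for both the reported and the expected tree shapes.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a syntax-tree node; it is NOT the
// go-tree-sitter Node type.
type node struct {
	kind, text string
	parent     *node
	children   []*node
}

// newNode builds a node and wires up the parent pointers of its children.
func newNode(kind, text string, children ...*node) *node {
	n := &node{kind: kind, text: text, children: children}
	for _, c := range children {
		c.parent = n
	}
	return n
}

// nextSibling returns the node that follows n under the same parent, or nil.
func (n *node) nextSibling() *node {
	if n.parent == nil {
		return nil
	}
	for i, c := range n.parent.children {
		if c == n && i+1 < len(n.parent.children) {
			return n.parent.children[i+1]
		}
	}
	return nil
}

// nextStatement (hypothetical helper) returns the node following n. When that
// node is a body_statement wrapper, it returns the wrapper's first child.
func nextStatement(n *node) *node {
	sib := n.nextSibling()
	if sib != nil && sib.kind == "body_statement" && len(sib.children) > 0 {
		return sib.children[0]
	}
	return sib
}

// reportedComment builds the reported shape (comment beside the body)
// and returns the comment node.
func reportedComment() *node {
	c := newNode("comment", "# comment1")
	newNode("method", "", newNode("identifier", "foo"), c,
		newNode("body_statement", "",
			newNode("call", `puts "statement 1"`),
			newNode("call", `puts "statement 2"`)))
	return c
}

// expectedComment builds the expected shape (comment inside the body)
// and returns the comment node.
func expectedComment() *node {
	c := newNode("comment", "# comment1")
	newNode("method", "", newNode("identifier", "foo"),
		newNode("body_statement", "", c,
			newNode("call", `puts "statement 1"`),
			newNode("call", `puts "statement 2"`)))
	return c
}

func main() {
	fmt.Println(nextStatement(reportedComment()).text)
	fmt.Println(nextStatement(expectedComment()).text)
}
```

Both calls yield the first puts statement, so code written this way would keep working whichever shape the grammar produces.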

Repro

def foo
  # comment1
  puts "statement 1"
  puts "statement 2"
  # comment2
  puts "statement 3"
end

adamcohen2 added the bug label Nov 20, 2024