Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LuaDoc. Fixed the start position of the comment first symbol in docs #3028

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

TIMONz1535
Copy link
Contributor

@TIMONz1535 TIMONz1535 commented Jan 4, 2025

The positions were calculated slightly incorrectly there, because it got confused with the lengths.

---@param a string?Comment
local function new(a) end

---@param a string[]Comment
local function new(a) end

now its Comment instead of omment.

изображение

But I came across the fact that the comment texts comment.text do not contain -- at the beginning, because this token is uuuh... being skipped?

That is, we were working with a string of a different length and start pos (-2). I made a variable local textOffset = 2 so that it could be easily fixed in the future.

p.s. Maybe you should figure out why this is happening.

upd. Please see the latest message for actual changes.

script/parser/luadoc.lua Outdated Show resolved Hide resolved
@tomlau10
Copy link
Contributor

tomlau10 commented Jan 5, 2025

But I came across the fact that the comment texts comment.text do not contain -- at the beginning, because this token is uuuh... being skipped?

Regarding this question, I think it makes sense because it stores the body of a comment 🤔

--foo bar
  • the comment body is only foo bar, not --foo bar
  • the prefix -- is not part of the comment body

@TIMONz1535
Copy link
Contributor Author

TIMONz1535 commented Jan 6, 2025

Damn, I found an issues

-- length = comment.finish - comment.start

-- length 26, relative finish 19, actual comment.text `-@param a string?Comment` (24 length, without 2)
-- ok
---@param a string?Comment

-- length 27, relative finish 18, actual comment.text `@param a string?Comment` (23 length, without 4!)
-- also real line length 29! `--` token was skipped before "calculating positions"
-- ok comment but wrong colorizing
--[[@param a string?Comment]]

 -- also wrong colorizing, so no comment issue
--[[@param a string?]]

-- length 33, relative finish 21, actual comment.text `@param a string?Comment` (23 length, without 10!!!)
-- wrong our offset 2, now comment text is `ment` :<
-- also wrong colorizing
--[===[@param a string[]Comment]===]

{20A06A79-94EB-452E-8A70-B9ED68299D84}

I could assume to search for the first character @, but most likely this function is not only for ---@stuff annotations?

Also the expression text:find('%S' seems pointless, because we're already doing util.trim (trimTailComment). Maybe it should have eliminated the empty spaces comments ---@param a string? . but we can never have the end char pos of a string >= comment.finish

@tomlau10
Copy link
Contributor

tomlau10 commented Jan 7, 2025

If you believe this PR is still not complete, I think the first thing is to turn it into draft:
=> this prevents maintainer accidentally merge the in progress PR 🤔 until everything sorts out
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/changing-the-stage-of-a-pull-request#converting-a-pull-request-to-a-draft


I will try to respond to the new found issues one by one when I got time 👀

@tomlau10
Copy link
Contributor

tomlau10 commented Jan 7, 2025

the unconsidered comment.long type, worked by chance

-- length 27, relative finish 18, actual comment.text `@param a string?Comment` (23 length, without 4!)
-- also real line length 29! `--` token was skipped before "calculating positions"
-- ok comment but wrong colorizing
--[[@param a string?Comment]]
  • so one of the new issues found is that, for comment.long type, although skipping 2 chars for it works, but the underline principle is different:
    • the actual comment.text is only @param a string?Comment, at first glance we would expect 4 chars are skipped
    • but no, comment.start is 2 in this case, which is what you are saying: "-- token was skipped before "calculating positions"
    • -4 + 2 => it is still -2, so the current convertLinePosToCommentPos() works for this case
  • but it will not work when using nested quote style, in your next example 🐛

not working on nested quote comment

-- length 33, relative finish 21, actual comment.text `@param a string?Comment` (23 length, without 10!!!)
-- wrong our offset 2, now comment text is `ment` :<
-- also wrong colorizing
--[===[@param a string[]Comment]===]
  • 🤔 as for this case, I notice that there is a comment.mark, which is actual extra skipped chars [===[ 💡
  • actually the existing logic already used this value when calculating startOffset 👀
    local startOffset = comment.start
    if comment.type == 'comment.long' then
    startOffset = startOffset + #comment.mark - 2
    end
  • by reviewing the case above --[[@param a string?Comment]], comment.mark = [[
    => this is indeed what we need 😄
    => so we can have a generalized form for the conversion functions pair:
local function convertLinePosToCommentPos(comment, linePos)
    -- some new comment explaining the new formula
    local textOffset = comment.type == 'comment.long' and #comment.mark or 2
    return linePos - comment.start + 1 - textOffset
end

local function convertCommentPosToLinePos(comment, commentPos)
    -- the reverse of convertLinePosToCommentPos()
    local textOffset = comment.type == 'comment.long' and #comment.mark or 2
    return commentPos + textOffset - 1 + comment.start
end

With this fix to both convertLinePosToCommentPos and convertCommentPosToLinePos, now all comments can be recognized correctly now, including your last test case --[===[@param a string[]Comment]===] 👍
But still the coloring for ? after string seems wrong 🤔 and currently I have no idea why.

@TIMONz1535 TIMONz1535 marked this pull request as draft January 7, 2025 13:00
@tomlau10
Copy link
Contributor

tomlau10 commented Jan 7, 2025

Also the expression text:find('%S' seems pointless, because we're already doing util.trim (trimTailComment). Maybe it should have eliminated the empty spaces comments ---@param a string? .

After some thought, this is not pointless 🤔

  • %S means non space characters
  • by using text:find('%S', <comment start offset>), we are filtering out empty comments
    • cstart = nil if no valid comment string is found
    • then the logic will not execute the if case below
    • => therefore we can avoid generating empty result.comment objects
  • otherwise for the case ---@param a string? , even though util.trim can trim all the spaces
    => an useless result.comment will still be generated

but we can never have the end char pos of a string >= comment.finish

If we can be certain that the length of comment.text is always <= comment.finish - comment.start,
then I agree that the convertCommentPosToLinePos(comment, cstart) < comment.finish check is pointless. 👍
As you said, in this case we can never have the converted trailing comment start pos >= comment.finish

…dLuaDoc, sync with semantic logic. Fix crash 'comment.clong' type for /*, fix typo docGeneric.
…o not try to process a multiline `result`, while `doc` is the first line.
@TIMONz1535
Copy link
Contributor Author

TIMONz1535 commented Jan 9, 2025

I have redone this.

  • Fix calculating the comment over doc not text in buildLuaDoc.

Actually we call parseTokens(doc, ...) and convertTokens(doc) so our result tokens is from doc, not comment.text. That's why we don't use text anymore.

local headPos = (comment.type == 'comment.short' and comment.text:match '^%-%s*@()')
or (comment.type == 'comment.long' and comment.text:match '^@()')
if not headPos then
return {
type = 'doc.comment',
start = comment.start,
finish = comment.finish,
range = comment.finish,
comment = comment,
}
end
-- absolute position of `@` symbol
local startOffset = comment.start + headPos
if comment.type == 'comment.long' then
startOffset = comment.start + headPos + #comment.mark - 2
end
local doc = comment.text:sub(headPos)
parseTokens(doc, startOffset)
local result, rests = convertTokens(doc)

The doc is the string after @ symbol, so I made a variable startOffset that means an absolute position of @ symbol.

startOffset for the comment.short

The startOffset for the comment.short should be comment.start + headPos + 2 - 2 but I simplified it. Here +2 is -- prefix offset and -2 is back from "1-based after @ pos" to "0-based of @ pos".

In cstart calculations I replaced the comment.start with startOffset, nothing else is needed.

Later I discovered exactly the same logic in the semantic-tokens.lua and synchronized them.

local headPos = (comm.type == 'comment.short' and comm.text:match '^%-%s*[@|]()')
or (comm.type == 'comment.long' and comm.text:match '^@()')
if headPos then
-- absolute position of `@` symbol
local startOffset = comm.start + headPos
if comm.type == 'comment.long' then
startOffset = comm.start + headPos + #comm.mark - 2
end
results[#results+1] = {
start = comm.start,
finish = startOffset,
type = define.TokenTypes.comment,
}
results[#results+1] = {
start = startOffset,
finish = startOffset + #comm.text:match('%S*', headPos) + 1,
type = define.TokenTypes.keyword,
modifieres = define.TokenModifiers.documentation,
}


  • Instead of the old check cstart < comment.finish, I made a new one finish >= comment.finish.

The old one implied that the comment.text end could be more than the comment.finish, but was written with error. I don't think that happens. I found a more common case where result can be a multiline annotation or an alias, while doc is the first line, so result.finish >>> comment.finish and we can't parse comment. In any case string.find prevented the error, but I wanted to make it an explicit check.

  • Fixed crash with a /* has no .mark field, it was type comment.long instead of comment.clong.

  • Fixed typo docGenric -> docGeneric.

  • Fix incorrect rest.range and finish on potentially multiline types.

  • I added %s* to the comment.long pattern ^%s*@() to support long string with leading spaces. But the string does not contain them due to a bug Parser. Long string doesn't contain leading spaces #3034.

--[===[    @param a string?ComMMS1]===]
local function new(a) end

This pattern is optional * and does not affect the current behavior.

My complete test
---@type integer, string, string Comment
local a, b, c

---@param a string?Comment
local function new(a) end

---    @param a string?ComSpa1
local function new(a) end

---@    param a string?ComSpa2
local function new(a) end

---    @    param a string?ComSpa3
local function new(a) end

--[[@param a string?ComMult]]
local function new(a) end

--[[    @param a string?ComMuS1]]
local function new(a) end

--[[@    param a string?ComMuS2]]
local function new(a) end

--[[    @    param a string?ComMuS3]]
local function new(a) end

--[===[@param a string?ComMMul]===]
local function new(a) end

--[===[    @param a string?ComMMS1]===]
local function new(a) end

--[===[@    param a string?ComMMS2]===]
local function new(a) end

--[===[    @    param a string?ComMMS3]===]
local function new(a) end

    ---  @param a string?        
    local function new(a) end

    --[[  @param a string?        ]]
    local function new(a) end

    --[===[  @param a string?        ]===]
    local function new(a) end

    ---  @param a string?        A
    local function new(a) end

    --[[  @param a string?        B]]
    local function new(a) end

    --[===[  @param a string?        C]===]
    local function new(a) end

Result

изображение

@TIMONz1535 TIMONz1535 marked this pull request as ready for review January 9, 2025 18:35
Copy link
Contributor

@tomlau10 tomlau10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that your latest changes can solve #3034 as well?
Then you can link it by adding this to the first line of original your PR description

fixes #3034 

https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword

This can let github close the linked issues on this PR merge 👀

script/parser/luadoc.lua Show resolved Hide resolved
script/core/semantic-tokens.lua Show resolved Hide resolved
@TIMONz1535
Copy link
Contributor Author

TIMONz1535 commented Jan 10, 2025

your latest changes can solve #3034 as well?

No, this is a potential support if the bug is resolved in the future (it will work automatically). But if is not, it does not affect anything.

Actually the pattern for short already contained optional spaces, but not for long ones. I did for the long ones as well.

@TIMONz1535
Copy link
Contributor Author

@tomlau10 can I just add mark = token field to comment.short, comment.cshort, comment.clong and don't think about the type of comment anymore?

This will eliminate the ambiguity, and we will no longer keep the -- prefix for comment.short in our head.

State.comms[#State.comms+1] = {
type = chead and 'comment.cshort' or 'comment.short',
start = left,
finish = getPosition(right, 'right'),
text = ssub(Lua, start + 2, right),
}
return true
end
if token == '/*' then
local start = Tokens[Index]
local left = getPosition(start, 'left')
Index = Index + 2
local result, right = resolveLongString '*/'
pushLongCommentError(left, right)
State.comms[#State.comms+1] = {
type = 'comment.clong',
start = left,
finish = right,
text = result,
}

@tomlau10
Copy link
Contributor

can I just add mark = token field to comment.short, comment.cshort, comment.clong and don't think about the type of comment anymore?

I don't think so 😕
By looking at the code, the concept of mark is the tokens that are quoting the content/body.
For example there is also a mark concept for string, which is single quote or double quote ' / ":

local function parseString(parent)
local tp, content = peekToken()
if not tp or tp ~= 'string' then
return nil
end
nextToken()
local mark = getMark()
-- compatibility
if content:sub(1, 1) == '"'
or content:sub(1, 1) == "'" then
if #content > 1 and content:sub(1, 1) == content:sub(-1, -1) then
mark = content:sub(1, 1)
content = content:sub(2, -2)
end
end
local str = {
type = 'doc.type.string',
start = getStart(),
finish = getFinish(),
parent = parent,
[1] = content,
[2] = mark,

Back to comment, for short comment there is no mark concept, as nothing is used to quote the comment body.
v.s. long comment it is the [[ / [====[ that quoting the comment body.


Maybe you can just keep this PR as is.
Our latest discussion about the formula is not too meaning 🙈
By Law of triviality we should not focus too much on things that are too trivial.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants