LuaDoc. Fixed the start position of the comment first symbol in docs #3028

TIMONz1535 · 2025-01-04T23:06:06Z

~~The positions were calculated slightly incorrectly there, because it got confused with the lengths.~~

---@param a string?Comment
local function new(a) end

---@param a string[]Comment
local function new(a) end

now its Comment instead of omment.

~~But I came across the fact that the comment texts comment.text do not contain -- at the beginning, because this token is uuuh... being skipped?~~

~~That is, we were working with a string of a different length and start pos (-2). I made a variable local textOffset = 2 so that it could be easily fixed in the future.~~

~~p.s. Maybe you should figure out why this is happening.~~

upd. Please see the latest message for actual changes.

script/parser/luadoc.lua

tomlau10 · 2025-01-05T14:29:55Z

But I came across the fact that the comment texts comment.text do not contain -- at the beginning, because this token is uuuh... being skipped?

Regarding this question, I think it makes sense because it stores the body of a comment 🤔

--foo bar

the comment body is only foo bar, not --foo bar
the prefix -- is not part of the comment body

TIMONz1535 · 2025-01-06T15:25:05Z

Damn, I found an issues

-- length = comment.finish - comment.start

-- length 26, relative finish 19, actual comment.text `-@param a string?Comment` (24 length, without 2)
-- ok
---@param a string?Comment

-- length 27, relative finish 18, actual comment.text `@param a string?Comment` (23 length, without 4!)
-- also real line length 29! `--` token was skipped before "calculating positions"
-- ok comment but wrong colorizing
--[[@param a string?Comment]]

 -- also wrong colorizing, so no comment issue
--[[@param a string?]]

-- length 33, relative finish 21, actual comment.text `@param a string?Comment` (23 length, without 10!!!)
-- wrong our offset 2, now comment text is `ment` :<
-- also wrong colorizing
--[===[@param a string[]Comment]===]

I could assume to search for the first character @, but most likely this function is not only for ---@stuff annotations?

Also the expression text:find('%S' seems pointless, because we're already doing util.trim (trimTailComment). Maybe it should have eliminated the empty spaces comments ---@param a string? . but we can never have the end char pos of a string >= comment.finish

tomlau10 · 2025-01-07T12:05:08Z

If you believe this PR is still not complete, I think the first thing is to turn it into draft:
=> this prevents maintainer accidentally merge the in progress PR 🤔 until everything sorts out
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/changing-the-stage-of-a-pull-request#converting-a-pull-request-to-a-draft

I will try to respond to the new found issues one by one when I got time 👀

tomlau10 · 2025-01-07T12:34:42Z

the unconsidered `comment.long` type, worked by chance

-- length 27, relative finish 18, actual comment.text `@param a string?Comment` (23 length, without 4!)
-- also real line length 29! `--` token was skipped before "calculating positions"
-- ok comment but wrong colorizing
--[[@param a string?Comment]]

so one of the new issues found is that, for comment.long type, although skipping 2 chars for it works, but the underline principle is different:
- the actual comment.text is only @param a string?Comment, at first glance we would expect 4 chars are skipped
- but no, comment.start is 2 in this case, which is what you are saying: "-- token was skipped before "calculating positions"
- -4 + 2 => it is still -2, so the current convertLinePosToCommentPos() works for this case
but it will not work when using nested quote style, in your next example 🐛

not working on nested quote comment

-- length 33, relative finish 21, actual comment.text `@param a string?Comment` (23 length, without 10!!!)
-- wrong our offset 2, now comment text is `ment` :<
-- also wrong colorizing
--[===[@param a string[]Comment]===]

🤔 as for this case, I notice that there is a comment.mark, which is actual extra skipped chars [===[ 💡

actually the existing logic already used this value when calculating startOffset 👀

lua-language-server/script/parser/luadoc.lua

Lines 1728 to 1731 in cb964c6

    
           local startOffset = comment.start 
        
           if comment.type == 'comment.long' then 
        
               startOffset = startOffset + #comment.mark - 2 
        
           end

by reviewing the case above --[[@param a string?Comment]], comment.mark = [[
=> this is indeed what we need 😄
=> so we can have a generalized form for the conversion functions pair:

local function convertLinePosToCommentPos(comment, linePos)
    -- some new comment explaining the new formula
    local textOffset = comment.type == 'comment.long' and #comment.mark or 2
    return linePos - comment.start + 1 - textOffset
end

local function convertCommentPosToLinePos(comment, commentPos)
    -- the reverse of convertLinePosToCommentPos()
    local textOffset = comment.type == 'comment.long' and #comment.mark or 2
    return commentPos + textOffset - 1 + comment.start
end

With this fix to both convertLinePosToCommentPos and convertCommentPosToLinePos, now all comments can be recognized correctly now, including your last test case --[===[@param a string[]Comment]===] 👍
But still the coloring for ? after string seems wrong 🤔 and currently I have no idea why.

tomlau10 · 2025-01-07T13:49:11Z

Also the expression text:find('%S' seems pointless, because we're already doing util.trim (trimTailComment). Maybe it should have eliminated the empty spaces comments ---@param a string? .

After some thought, this is not pointless 🤔

%S means non space characters
by using text:find('%S', <comment start offset>), we are filtering out empty comments
- cstart = nil if no valid comment string is found
- then the logic will not execute the if case below
- => therefore we can avoid generating empty result.comment objects
otherwise for the case ---@param a string? , even though util.trim can trim all the spaces
=> an useless result.comment will still be generated

but we can never have the end char pos of a string >= comment.finish

If we can be certain that the length of comment.text is always <= comment.finish - comment.start,
then I agree that the convertCommentPosToLinePos(comment, cstart) < comment.finish check is pointless. 👍
As you said, in this case we can never have the converted trailing comment start pos >= comment.finish

…dLuaDoc, sync with semantic logic. Fix crash 'comment.clong' type for /*, fix typo docGeneric.

…o not try to process a multiline `result`, while `doc` is the first line.

…(pattern is optional)

TIMONz1535 · 2025-01-09T18:03:41Z

I have redone this.

Fix calculating the comment over doc not text in buildLuaDoc.

Actually we call parseTokens(doc, ...) and convertTokens(doc) so our result tokens is from doc, not comment.text. That's why we don't use text anymore.

lua-language-server/script/parser/luadoc.lua

Lines 1716 to 1736 in 9ba27a6

    
           local headPos = (comment.type == 'comment.short' and comment.text:match '^%-%s*@()') 
        
                        or (comment.type == 'comment.long'  and comment.text:match '^@()') 
        
           if not headPos then 
        
               return { 
        
                   type    = 'doc.comment', 
        
                   start   = comment.start, 
        
                   finish  = comment.finish, 
        
                   range   = comment.finish, 
        
                   comment = comment, 
        
               } 
        
           end 
        
           -- absolute position of `@` symbol 
        
           local startOffset = comment.start + headPos 
        
           if comment.type == 'comment.long' then 
        
               startOffset = comment.start + headPos + #comment.mark - 2 
        
           end 
        
           local doc = comment.text:sub(headPos) 
        
           parseTokens(doc, startOffset) 
        
           local result, rests = convertTokens(doc)

The doc is the string after @ symbol, so I made a variable startOffset that means an absolute position of @ symbol.

startOffset for the comment.short

The startOffset for the comment.short should be comment.start + headPos + 2 - 2 but I simplified it. Here +2 is -- prefix offset and -2 is back from "1-based after @ pos" to "0-based of @ pos".

In cstart calculations I replaced the comment.start with startOffset, nothing else is needed.

Later I discovered exactly the same logic in the semantic-tokens.lua and synchronized them.

lua-language-server/script/core/semantic-tokens.lua

Lines 906 to 924 in 9ba27a6

    
           local headPos = (comm.type == 'comment.short' and comm.text:match '^%-%s*[@|]()') 
        
                        or (comm.type == 'comment.long'  and comm.text:match '^@()') 
        
           if headPos then 
        
               -- absolute position of `@` symbol 
        
               local startOffset = comm.start + headPos 
        
               if comm.type == 'comment.long' then 
        
                   startOffset = comm.start + headPos + #comm.mark - 2 
        
               end 
        
               results[#results+1] = { 
        
                   start  = comm.start, 
        
                   finish = startOffset, 
        
                   type   = define.TokenTypes.comment, 
        
               } 
        
               results[#results+1] = { 
        
                   start      = startOffset, 
        
                   finish     = startOffset + #comm.text:match('%S*', headPos) + 1, 
        
                   type       = define.TokenTypes.keyword, 
        
                   modifieres = define.TokenModifiers.documentation, 
        
               }

Instead of the old check cstart < comment.finish, I made a new one finish >= comment.finish.

The old one implied that the comment.text end could be more than the comment.finish, but was written with error. I don't think that happens. I found a more common case where result can be a multiline annotation or an alias, while doc is the first line, so result.finish >>> comment.finish and we can't parse comment. In any case string.find prevented the error, but I wanted to make it an explicit check.

Fixed crash with a /* has no .mark field, it was type comment.long instead of comment.clong.
Fixed typo docGenric -> docGeneric.
Fix incorrect rest.range and finish on potentially multiline types.
I added %s* to the comment.long pattern ^%s*@() to support long string with leading spaces. But the string does not contain them due to a bug Parser. Long string doesn't contain leading spaces #3034.

--[===[    @param a string?ComMMS1]===]
local function new(a) end

This pattern is optional * and does not affect the current behavior.

My complete test

---@type integer, string, string Comment
local a, b, c

---@param a string?Comment
local function new(a) end

---    @param a string?ComSpa1
local function new(a) end

---@    param a string?ComSpa2
local function new(a) end

---    @    param a string?ComSpa3
local function new(a) end

--[[@param a string?ComMult]]
local function new(a) end

--[[    @param a string?ComMuS1]]
local function new(a) end

--[[@    param a string?ComMuS2]]
local function new(a) end

--[[    @    param a string?ComMuS3]]
local function new(a) end

--[===[@param a string?ComMMul]===]
local function new(a) end

--[===[    @param a string?ComMMS1]===]
local function new(a) end

--[===[@    param a string?ComMMS2]===]
local function new(a) end

--[===[    @    param a string?ComMMS3]===]
local function new(a) end

    ---  @param a string?        
    local function new(a) end

    --[[  @param a string?        ]]
    local function new(a) end

    --[===[  @param a string?        ]===]
    local function new(a) end

    ---  @param a string?        A
    local function new(a) end

    --[[  @param a string?        B]]
    local function new(a) end

    --[===[  @param a string?        C]===]
    local function new(a) end

Result

tomlau10

Seems that your latest changes can solve #3034 as well?
Then you can link it by adding this to the first line of original your PR description

fixes #3034

https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword

This can let github close the linked issues on this PR merge 👀

script/parser/luadoc.lua

script/core/semantic-tokens.lua

TIMONz1535 · 2025-01-10T13:22:10Z

your latest changes can solve #3034 as well?

No, this is a potential support if the bug is resolved in the future (it will work automatically). But if is not, it does not affect anything.

Actually the pattern for short already contained optional spaces, but not for long ones. I did for the long ones as well.

TIMONz1535 · 2025-01-10T23:41:11Z

@tomlau10 can I just add mark = token field to comment.short, comment.cshort, comment.clong and don't think about the type of comment anymore?

This will eliminate the ambiguity, and we will no longer keep the -- prefix for comment.short in our head.

lua-language-server/script/parser/compile.lua

Lines 578 to 597 in 9ba27a6

    
               State.comms[#State.comms+1] = { 
        
                   type   = chead and 'comment.cshort' or 'comment.short', 
        
                   start  = left, 
        
                   finish = getPosition(right, 'right'), 
        
                   text   = ssub(Lua, start + 2, right), 
        
               } 
        
               return true 
        
           end 
        
           if token == '/*' then 
        
               local start = Tokens[Index] 
        
               local left = getPosition(start, 'left') 
        
               Index = Index + 2 
        
               local result, right = resolveLongString '*/' 
        
               pushLongCommentError(left, right) 
        
               State.comms[#State.comms+1] = { 
        
                   type   = 'comment.clong', 
        
                   start  = left, 
        
                   finish = right, 
        
                   text   = result, 
        
               }

tomlau10 · 2025-01-11T01:44:33Z

can I just add mark = token field to comment.short, comment.cshort, comment.clong and don't think about the type of comment anymore?

I don't think so 😕
By looking at the code, the concept of mark is the tokens that are quoting the content/body.
For example there is also a mark concept for string, which is single quote or double quote ' / ":

lua-language-server/script/parser/luadoc.lua

Lines 697 to 719 in cb964c6

    
           local function parseString(parent) 
        
               local tp, content = peekToken() 
        
               if not tp or tp ~= 'string' then 
        
                   return nil 
        
               end 
        
               nextToken() 
        
               local mark = getMark() 
        
               -- compatibility 
        
               if content:sub(1, 1) == '"' 
        
               or content:sub(1, 1) == "'" then 
        
                   if #content > 1 and content:sub(1, 1) == content:sub(-1, -1) then 
        
                       mark = content:sub(1, 1) 
        
                       content = content:sub(2, -2) 
        
                   end 
        
               end 
        
               local str = { 
        
                   type   = 'doc.type.string', 
        
                   start  = getStart(), 
        
                   finish = getFinish(), 
        
                   parent = parent, 
        
                   [1]    = content, 
        
                   [2]    = mark,

Back to comment, for short comment there is no mark concept, as nothing is used to quote the comment body.
v.s. long comment it is the [[ / [====[ that quoting the comment body.

Maybe you can just keep this PR as is.
Our latest discussion about the formula is not too meaning 🙈
By Law of triviality we should not focus too much on things that are too trivial.

tomlau10 reviewed Jan 5, 2025

View reviewed changes

script/parser/luadoc.lua Outdated Show resolved Hide resolved

TIMONz1535 mentioned this pull request Jan 6, 2025

Wrong colorizing with generic pattern and long string annotations #3032

Open

TIMONz1535 marked this pull request as draft January 7, 2025 13:00

TIMONz1535 mentioned this pull request Jan 9, 2025

Parser. Long string doesn't contain leading spaces #3034

Open

TIMONz1535 added 3 commits January 9, 2025 22:04

LuaLS#3028. Fix calculating the comment over doc not text in buil…

9ba27a6

…dLuaDoc, sync with semantic logic. Fix crash 'comment.clong' type for /*, fix typo docGeneric.

LuaLS#3028. Fix incorrect rest.range and finish on multiline types. D…

cbe45e1

…o not try to process a multiline `result`, while `doc` is the first line.

LuaLS#3028. Support for a long strings with leading spaces LuaLS#3034 …

4af79e3

…(pattern is optional)

TIMONz1535 force-pushed the TIMONz1535-patch-3 branch from 537aaed to 4af79e3 Compare January 9, 2025 17:15

TIMONz1535 marked this pull request as ready for review January 9, 2025 18:35

tomlau10 reviewed Jan 10, 2025

View reviewed changes

script/parser/luadoc.lua Show resolved Hide resolved

script/core/semantic-tokens.lua Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LuaDoc. Fixed the start position of the comment first symbol in docs #3028

LuaDoc. Fixed the start position of the comment first symbol in docs #3028

TIMONz1535 commented Jan 4, 2025 •

edited

Loading

tomlau10 commented Jan 5, 2025

TIMONz1535 commented Jan 6, 2025 •

edited

Loading

tomlau10 commented Jan 7, 2025

tomlau10 commented Jan 7, 2025 •

edited

Loading

tomlau10 commented Jan 7, 2025 •

edited

Loading

TIMONz1535 commented Jan 9, 2025 •

edited

Loading

tomlau10 left a comment

TIMONz1535 commented Jan 10, 2025 •

edited

Loading

TIMONz1535 commented Jan 10, 2025

tomlau10 commented Jan 11, 2025

LuaDoc. Fixed the start position of the comment first symbol in docs #3028

Are you sure you want to change the base?

LuaDoc. Fixed the start position of the comment first symbol in docs #3028

Conversation

TIMONz1535 commented Jan 4, 2025 • edited Loading

upd. Please see the latest message for actual changes.

tomlau10 commented Jan 5, 2025

TIMONz1535 commented Jan 6, 2025 • edited Loading

tomlau10 commented Jan 7, 2025

tomlau10 commented Jan 7, 2025 • edited Loading

the unconsidered comment.long type, worked by chance

not working on nested quote comment

tomlau10 commented Jan 7, 2025 • edited Loading

TIMONz1535 commented Jan 9, 2025 • edited Loading

tomlau10 left a comment

Choose a reason for hiding this comment

TIMONz1535 commented Jan 10, 2025 • edited Loading

TIMONz1535 commented Jan 10, 2025

tomlau10 commented Jan 11, 2025

TIMONz1535 commented Jan 4, 2025 •

edited

Loading

TIMONz1535 commented Jan 6, 2025 •

edited

Loading

tomlau10 commented Jan 7, 2025 •

edited

Loading

the unconsidered `comment.long` type, worked by chance

tomlau10 commented Jan 7, 2025 •

edited

Loading

TIMONz1535 commented Jan 9, 2025 •

edited

Loading

TIMONz1535 commented Jan 10, 2025 •

edited

Loading