-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lint: handle multiline jinja2 code #6552
base: 8.4.x
Are you sure you want to change the base?
Conversation
@oliver-sanders Please could you have a look at this proof of concept before I bother testing it? |
7284e7c
to
8335bd2
Compare
4477ce9
to
becc52e
Compare
becc52e
to
9191eae
Compare
9191eae
to
7437cfa
Compare
I think this is going to be the quickest way to get over the state barrier, but going statefull is a good opportunity to review our approach here before we lay down a pattern which ends up biting us later. I haven't really given much thought to this so far, I don't have working knowledge of modern linters, a bit of reading would be a good idea... Mid to long term, it depends how far we want to go with the linter. The initial idea was a quick and dirty line-by-line regex tool, but we thought we would leave it there (we decided it would not be statefull). Now we are clearly at the point where we want to develop this into something more serious. I think the optimal solution for this sort of problem is something more tokeniser'ey. I.e, a program that reads the file into labeled bits rather than lines, allowing rules to be run on each "bit" in a more structured way than regex would allow by itself. E.g, here's an example of how a tokeniser'ey type thing might see the file: >>> for type, content, state in tokeniser(file):
... print((type, content))
('shebang', '#!Jinja2', <file:flow.cylc, lines:[0], preproc:jinja2, path:>)
('blankline', '', <file:flow.cylc, lines:[1], preproc:jinja2, path:>)
('section', '[scheduling]', <file:flow.cylc, lines:[2], preproc:jinja2, path:[scheduling]>)
('config', ' initial cycle point = 2000', <file:flow.cylc, lines:[3], preproc:jinja2, path:[scheduling]initial cycle point>)
('cycle point', '2000', <file:flow.cylc, lines:[3], preproc:jinja2, path:[scheduling]initial cycle point>)
('section', ' [[graph]]', <file:flow.cylc, lines:[4], preproc:jinja2, path:[scheduling][graph]>)
('config', ' R1 = """\nfoo => bar\nbar => baz\n """', <file:flow.cylc, lines:[5,6,7,8], path:[scheduling][graph]>)
('graph', 'foo => bar\nbar => baz', <file:flow.cylc, lines:[6,7,8], path:[scheduling][graph]R1>)
('jinja2', '{% if ... \n ... %}', <file:flow.cylc, lines:[9,10], path:[scheduling][graph]>)
('jinja2', '{% include ... %}', <file:flow.cylc, lines:[11], path:?>) It's similar (but less detailed & more annotated) to what our syntax highlighters see. Note, we are carrying the Cylc config "path" where the information is available allowing us to apply specific rules to particular sections or configs and each entry could represent multiple lines. This would allow us, for example, to apply rules, only to graph sections, and to do clever things, like read a few lines ahead to check if different task configs are compatible. It might be relatively simple to get a paired-down version of this working, e.g something that has only awareness of "plain" Cylc lines and Jinja2 expressions. There might also be the option of building a linter into one of the Cylc grammars we have already developed? I wouldn't be surprised if this is becoming common in the LSP era. I mention these things largely to curtail the scope of what we do now. It might be that the Cylc linter never gets much further than it is now, in which case this is a pretty decent pattern. But it is also possible that we end up developing this much further and the community starts contributing rulesets. So we should be cautious of how far we push this pattern, if it's starting to look like this utill might balloon, we should be ready to change tack rather than bang our heads against the pattern we've got. If there's a pattern out there we could easily adopt, this might be a good opportunity to do so before we start trying to get clever with our line reader. And if we do go ahead with this (and I don't think that would be a bad option), we should be cautious over how far we push it. |
Much as I like the linter (I do!) we should think carefully, and maybe look at how widely it is used, before sinking a lot of effort into making it significantly more general than it is now (I think this gels with what Oliver is saying, but in more general terms). (Priorities vs resources, as ever...) |
We already have validate, and I'm against trying too hard with lint. If that means that eventually there are things we can't deal with, so be it. It's already a goodish way from "a sed script to identify 7 to 8 changes". |
Closes #6348
Closes #6507
Check List
CONTRIBUTING.md
and added my name as a Code Contributor.setup.cfg
(andconda-environment.yml
if present).?.?.x
branch.