Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow sorting schema properties by name #78

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

bodograumann
Copy link

Currently it is not possible to sort properties, schemas or responses by name, because any order defined for them is applied to the nested objects instead.
In order to distinguish between direct ordering of an object and nested ordering of its keys, I have introduced a new notation properties[*] here akin to JSONPath as used by Spectral.
Ideally the sort order would use complete path expressions, but that will take much more effort.

I know this is a breaking change for anyone using a custom sort order, but I think it is worthwhile.
The current implementation is limited and I found it difficult to understand how a custom sortFile even works. First I used trial-and-error to see what can and can't be sorted and then I looked at the source code.
E.g. the default sort order defines content: [], which means sort anything under a content key alphabetically and it has properties: […], which means sort the fields of all objects nested inside the properties key.

  • Simplify branching in sort function
  • Explicitly distinguish between direct and nested sorting
  • Test sorting schema properties by name

@ed-randall-blk
Copy link

ed-randall-blk commented Jul 7, 2023

Please could you clarify, how do you configure it so that:

  1. The parameters on each endpont path are sorted by arg name
  2. For each parameter the property names are sorted as expected
    I thought I understood your patch and tried
  "parameters[*]": ["name", "description", "in", "type", "format", "required", "default", "maximum", "minimum", "schema"],
  "parameters": ["name"]

and (2) is happening but (1) is not

@thim81
Copy link
Owner

thim81 commented Jul 9, 2023

@bodograumann

Thanks for the PR, I still have to find time to review the code and the impact of the breaking changes.
I understand the need to sort properties, schemas or responses by name, is the additional test https://github.com/thim81/openapi-format/pull/78/files#diff-2ed84511e150b344f99bcb395066fc8fd3b8a9e7580406e99a78ace4ee63fa85 a good example for me to fully understand the expected behaviour?

@bodograumann
Copy link
Author

Good catch @ed-randall-blk.
Turns out the current unit tests cannot actually test the order of keys in dictionaries, because the loadTest util parses the json/yaml file into an object before comparing actual vs expected.
Let me try to fix this.

@bodograumann bodograumann force-pushed the sort-schema-properties branch from 7345a8c to 32afa53 Compare July 10, 2023 12:17
@bodograumann
Copy link
Author

Ok, I added a sort implementation for arrays now, @ed-randall-blk (e.g. of parameters)
Beforehand I only sorted objects by keys, e.g. properties.
Giving the confusion this caused, I feel adding it is the right decision.

You are right, @thim81, that the new test should demonstrate the added/changed behaviour.
I also added the limit parameter there now, so that sorting an arry of parameters is shown as well.

@ed-randall-blk
Copy link

This is really good. I think it's doing exactly what we need, your efforts are much appreciated, thankyou.
To clarify our use-case:
We have an openapi schema generated from python apispec;
We deploy this config into Apigee via a PR process as a final confidence check as to what's changed (and to help with writing a release note). Unfortunately the order of everything in the .json has been indeterminate up to now, and the size has grown making the PR review almost impossible.
Using this tool to massage the file into a determinate ordering will help no end.

@ed-randall-blk
Copy link

Two issues remain for our use-case:

  1. Is there a way to sort the "paths"? The paths aren't an array, and each name is unique...
  2. The "root" only sorts if there's an "openapi" key present; Some of our files are missing this but have "swagger": "2.0" instead.

@bodograumann
Copy link
Author

You bring up some good ideas, @ed-randall-blk , but I think they would blow up the scope of this PR too much.

Sorting paths could work similar to the component order. There the definitions are ordered by the key. Unfortunately the paths are not contained within the components. So we need some new idea that won't make the configuration much more complex.

Not sure what is behind the second point, but I extpect it to be not too difficult to fix. Still better do it in a separate PR.

@ed-randall-blk
Copy link

ed-randall-blk commented Aug 7, 2023

Since writing that I have understood the behaviour of this better;

  1. Can apparently get paths to sort by name simply by adding:
    "paths: []"
    I'd not have expected this from the documentation, I'd have expected the order to preserve, but I'm happy.
  2. I first patched the code locally but later figured out how to make python apispec add an "openapi: 2.0" tag into the generated swagger.json doc to achieve this without any issues.
    We now have a working "normalization" config as:
{
  "root": ["openapi", "info", "servers", "basePath", "definitions", "parameters", tags", "paths"],
  "paths": [],
  "get": ["operationId", "description", "parameters", "requestBody", "responses", "summary", "tags"],
  "post": ["operationId", "description", "parameters", "requestBody", "responses", "summary", "tags"],
  "put": ["operationId", "description", "parameters", "requestBody", "responses", "summary", "tags"],
  "patch": ["operationId", "description", "parameters", "requestBody", "responses", "summary", "tags"],
  "delete": ["operationId","description", "parameters", "requestBody", "responses", "summary", "tags"],
  "parameters[*]": ["name", "description", "in", "type", "format", "required", "default", "maximum", "minimum", "schema"],
  "parameters": ["name"],
  "properties": []
}

we can run our generated files through this and review/compare changes consistently much more easily than before.

@thim81
Copy link
Owner

thim81 commented Aug 9, 2023

@ed-randall-blk @bodograumann

Following up on the progress and the findings.

  1. Is the PR still needed or is using an [] the solution? If yes, I'll document this in the readme for future users.
    Since I'm cautious of introducing the new [*] syntax because it feels more complex to configure and would potentially break existing configurations.
  2. The "openapi" property is needed to detect the "ROOT" level to apply the sorting of the root elements, plus I assumed that the "openapi" would always be present in OpenAPI documents since openapi-format was supporting: OpenAPI 3.0 & 3.1.
    Is it compatible with "swagger": "2.0"?

@bodograumann
Copy link
Author

Yes this PR is definitely still needed to be able to sort these things.
E.g. properties is an array of objects.
We want to sort this array in some way; according to a key in the objects.
Then we also want to sort the keys in each property object in a fixed order.
These two operations need separate configuration and without this PR it is not possible to sort the array at all.

I tried to keep the breakage to a minimum.
Another approach would be to define a completely separate new sorting method, so users could opt-in to the new, more powerful approach, while existing use cases would not be affected. This would also allow us to do more drastic redesigns of the sorting configuration.
That would require a bit more effort in terms of duplicating tests though.

What do you think, @thim81 ?

@ed-randall-blk
Copy link

ed-randall-blk commented Aug 10, 2023

The "swagger.json" (swagger 2.0) document that I'm working with seems fully compatible with the tool (for sorting purposes, at least) other than that it has "swagger: 2.0" instead of "openapi: 3.0" at the top; I tried to update the python to add both tags - but this fails the swagger validation :

- .openapi in body is a forbidden property

So it would be useful if openapi-format would recognise either openapi or swagger tag at the root.

On the sorting front, yes we've been using a fork with @bodograumann 's patch applied to achieve stable ordering / normalization of the file.

@thim81
Copy link
Owner

thim81 commented Nov 1, 2023

hi @bodograumann & @ed-randall-blk

Re-opening the discussion.
I had already foreseen a mechanism to sort by using the --sortComponentsFile option.
By defining which elements that should be sorted, the CLI will sort the elements alphabetically.

Have a look at the tests:

Given that this option, will this solve your use-case?

@bodograumann
Copy link
Author

I'll look into it, @thim81 , but it is really confusing when sortComponentsFile sorts thing that are not within the components key.

@bodograumann
Copy link
Author

Let me be a bit more in-depth.
Here are the things that should be sortable imho, together with some examples:

  1. Arrays of objects alphabetically by a value of a specific key in the children
    • parameters of a rest operation
    • top-level server and others
  2. Arrays of strings alphabetically by their values
    • enum options
    • required fields
  3. Objects with a fixed set of keys by listing the key order
    • top level info and others
    • any path (keys are the rest verbs like get: and parameters:)
    • any endpoint definition below the rest verbs
    • all kind of definitions that you would find under components, but also inline under the paths
  4. Maps with a variable key alphabetically by the key
    • properties of on object schema
    • responses with the response code as key
    • responseBodies with the content-type of key
    • all children of components

Going over the current openapi-format.js in main:

  • Lines 53 - 62:
    Any child of components can be sorted according to (4) above, when defined in sortComponents. This does not include any of the examples in (4) outside of components though.
  • Lines 63 - 71:
    Any array outside of components can be sorted according to (1), when defined in sortComponents, but only by the name key of the items, which does not always exist (e.g. servers). Furthermore there can be list of parameters nested inside reusable components.pathItems. So again (1) is not fully covered.
  • Lines 72 - 74:
    Here we start sorting by sort instead of sortComponents, when the current key is configured there.
  • Lines 75 - 88:
    Objects are sorted here by an order on the keys as per (3), but only when they are nested inside an array. This is so that we can associate them with the identifier in the configuration.
  • Lines 89 - 98:
    Sort the children of responses, schemas and properties according to (3). So here the parent is not an array as directly before, but an object. This fails to cover the children in paths for example.
  • Lines 99 - 110:
    This also covers (3), but at this point associating the order not via the key of the parent node, but by the key of the object itself.

All in all I must say that this part of the code is quite convoluted and hard to work on. That is why I tried to clean it up a bit in this PR.

The main point of this PR was to solve the issue in the last two blocks. Any order defined for responses, schemas and properties is applied to the keys of their children. The parent objects themselves are never sorted alphabetically by their keys. That is why I introduced the [*] notation to make it clearer by which key (parent or self) we configure the order in (3).
The issue persists on current main.

Later, by @ed-randall-blk's request, I also added support for (1), which might have increased the scope of this PR a bit too much, I must admit. My array sorting also allows sorting by multiple fields, which again might be overkill.
(2) is not covered at all yet afaics.

What do yo think about introducing a completely new sorting configuration, besides the current one?

  • when any of the existing two sorting configs exist, ignore the new one (maybe until the next major version)

  • sensible defaults that sort almost everything, so the user will rarely have to define it

  • just a single file instead of two

  • auto-detect the config in the current workdir or npm project; no need to give an explicit parameter

  • syntax that clearly shows what is being sorted and how, e.g. in the order (1) to (4) from above:

    {
        "sortChildrenByAttribute": {
            "parameters": "name"
        },
        "sortChildrenByValue": ["enum", "required"],
        "sortKeysInOrder": {
            "info": ["title", "description", "version"],
            "schema": ["type", "items", "parameters"],
            "get": ["summary", "description", "operationId", "responses"]
        },
        "sortChildKeysInOrder": {
            "paths": ["get", "put", "patch", "post"],
            "parameters": ["name", "in", "description", "schema"],
            "schemas": ["type", "items", "parameters"]
        },
        "sortKeysAlphabetically": ["paths", "properties", "responses"]
    }

    sortChildKeysInOrder covers both the children of objects as well items in arrays.

  • json-schema to validate that config file

Sidenote:
One other though was, that we might not have to be as generic in our sorting, but could only allow configuring orders for the exact types and attributes that occur in the openapi specification. This would also remove any redundancy e.g. between children of components.schemas and inline schema definitions, where we now have to define the key order two times.
On the other hand this would bind us very tightly to the details of the openapi spec and any change there could have implications for us.
The bigger concern would probably be assembling all the possible keys and types that could be configured instead of just going by arbitrary string keys.
That is why I still prefer the current approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants