-
Notifications
You must be signed in to change notification settings - Fork 27
xpath query is not validated #332
Comments
Yes, the library isn't ideal. We should fix it upstream. But also there were few discussions that XPath might not be the best language going forward, so maybe we will just use a new language? Will need to decide at some point. |
Ok. I'm wondering if graphql would be a good choice here. |
Real GraphQL needs a srits schema and we don't have one by design. But I've implemented a GraphQL endpoint with dynamic schema for Cayley, so in practice it can be done. The main problem is that GraphQL still cannot change the shape of a response. If you are querying multiple nested nodes, but only want few value extracted from them, you will end up requesting a skeleton of that subtree with few values in there. We need to check if there are any alternatives that can allow querying exactly what the user needs. |
Keep in mind that we do support, recommend and demo XPath today. Feel free to open a separate issue and start discussing proposals for the future, but this is a real issue today. |
Since the errors on the test code always say the same, so it's not clear to me what of the examples are supossed to be right or wrong, but I've tested with an online XPath validator. I've also tested using the client-python v3 (which uses the SDK and thus the library).
|
@kuba-- @ajnavarro if you guys have any thoughts on results of @juanjux preliminary research - please feel free to share |
For me it's all about optimization. If we can reject wrong query earlier instead of waiting till the end to say "no results" (because query is no valid) then it's better. In other words we'll take what LA team give us ;) The open question is - do we wanna change 3rd library to parse xpath, or stick to what we use so far? |
@kuba-- totally agree. I think the problem here is that almost any string is a valid XPath query. Maybe we should add some extra validation for our use case, like the supported node names. Per example, if the user tries to query |
We may introduce waterfall of validators. XPath library should validate just xpath. So, bblfsh will be sure that got valid xpath and can focus on validating own syntax and reject e.g. non-uast queries: |
This will require us to have a strict schema exposed to XPath. And before even trying this we will need to refactor XPath library to allow such hooks to be installed. |
No worries, I can help ;) |
In general, I think it's a good idea, but we are not ready for it right now. For example "our" XPath is well defined for Semantic nodes ( |
So let's do it in "baby steps". Personally, I prefer many small features/tools which we can add one by one instead of one big 🎉 effect. |
Totally agree! We can start by making an XPath library better. It's a bit messy right now. |
Talking about "baby steps", how about changing the current xpath library ( At least in following test it can find the last invalid xpath: package main
import (
. "gopkg.in/xmlpath.v2"
// . "github.com/antchfx/xpath"
)
func main() {
MustCompile("//*[@role=\"Return\"]")
MustCompile("BAD")
MustCompile("this is a totally crap. For sure not valid @xpath") // <----
} |
I checked the library briefly, it looks better but has one downside: it cannot work with virtual nodes, only with full XML tree. Still, it may be an option to fork the library and make it work with some node interface instead of a specific type. |
The issue is related to:
Generally SDK uses 3rd-party package (github.com/antchfx/xpath) to compile xpath queries.
Unfortunately lot of incorrect xpaths passes, e.g.:
The text was updated successfully, but these errors were encountered: