Allow square-bracket notation after anchor selector #105

Open
netbrothers-tr opened this issue Dec 5, 2024 · 0 comments

netbrothers-tr commented Dec 5, 2024

Currently, the XPathExpressionDiscoverer only accepts selectors that end with /a. As a result, narrowing the selector with square-bracket notation (an XPath predicate) is not supported. Supporting it would make the spider considerably more powerful, and it would not require a large change.

An example of the square-bracket notation could be the following.

//a[starts-with(@href, '/') or starts-with(@href, '$url')]

To allow this, the spider could either relax the check on the selector argument (for example, by replacing endsWith with a regular expression) or move the validation of the selector argument out of the constructor into a separate method (perhaps a protected one). Anyone extending the XPathExpressionDiscoverer could then override that method and supply their own selector validation.
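To make the two options concrete, here is a rough, language-neutral sketch in Python. It is not the library's PHP code: the class names, the `_validate_selector` hook, and the `split_trailing_predicates` helper are all hypothetical and exist only for illustration. Instead of a regular expression, it uses a small scanner, because a single pattern has trouble with nested brackets and with `]` inside quoted strings. It assumes XPath 1.0 string literals, which have no escape sequences.

```python
def split_trailing_predicates(selector):
    """Split a selector into (path, trailing predicates).

    Hypothetical helper: only the run of [...] predicates at the very end
    of the selector is split off. Brackets inside quoted strings are ignored.
    """
    depth = 0      # current nesting level of [...]
    quote = None   # the quote character we are inside, if any
    start = None   # index where the trailing run of predicates begins
    for i, ch in enumerate(selector):
        if quote:
            if ch == quote:
                quote = None
            continue
        if ch in "'\"":
            quote = ch
            if depth == 0:
                start = None
        elif ch == "[":
            if depth == 0 and start is None:
                start = i
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ']' in selector")
        elif depth == 0 and not ch.isspace():
            # Something other than a predicate follows, so any predicates
            # seen so far are not trailing ones.
            start = None
    if depth != 0 or quote:
        raise ValueError("unbalanced brackets or quotes in selector")
    if start is None:
        return selector, ""
    return selector[:start], selector[start:]


class StrictDiscovererSketch:
    """Stand-in for the behaviour described above: selector must end in /a."""

    def __init__(self, selector):
        # Validation lives in an overridable method, not inline here.
        self._validate_selector(selector)
        self.selector = selector

    def _validate_selector(self, selector):
        if not selector.endswith("/a"):
            raise ValueError("selector must end with /a")


class PredicateAwareDiscovererSketch(StrictDiscovererSketch):
    """Subclass that also accepts predicates after the final /a step."""

    def _validate_selector(self, selector):
        path, _predicates = split_trailing_predicates(selector)
        if not path.rstrip().endswith("/a"):
            raise ValueError("selector must target anchors (/a), "
                             "optionally followed by predicates")


selector = "//a[starts-with(@href, '/') or starts-with(@href, '$url')]"
discoverer = PredicateAwareDiscovererSketch(selector)
```

With this split, the strict class still rejects the selector from the example, while the subclass accepts it and continues to reject something like `//div/a[1]/span[2]`, whose final step is not an anchor.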
