Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Add per input rate limit. #62

Open
kevthehermit opened this issue Jan 23, 2019 · 7 comments
Open

[Feature] Add per input rate limit. #62

kevthehermit opened this issue Jan 23, 2019 · 7 comments
Assignees

Comments

@kevthehermit
Copy link
Owner

Sites like slexy have heavy rate limits. - #52

Add a configurable rate limit per input source

@alesandroortiz
Copy link
Contributor

Hello @kevthehermit,

I'm the maintainer of Slexy.org. I opened a PR a few minutes ago to help remedy some Slexy Terms of Service violations. The rate limiting/throttling is the next step, but I am unable to implement this. Is this in the short-term project roadmap?

Lack of throttling + excessive HTTP -> HTTPS redirects + non-identifying UA have all led to a lot of folks being banned. Lack of throttling still causes temporary or permanent bans. e.g. #52

My PR: #94

Regards,
@thephpjedi

@Plazmaz Plazmaz self-assigned this Dec 1, 2019
@Plazmaz
Copy link
Collaborator

Plazmaz commented Dec 1, 2019

FYI @thephpjedi this will likely be a significant lift due to the multithreading we're doing and require a large rewrite of a pretty core part of the application. I have been poking at it, but it will take a while.

If you wanted, you could make it a LOT easier by simply imposing a ratelimit and returning the remaining requests in a header, similar to how github handles things, which would avoid the need for us to track and lock the remaining ratelimit on our side, and would make this restriction more obvious to developers in the future. Not a necessity, but would make this happen a lot more easily if you enforced that limit on your side by simply not returning data.
It'd also be less nuclear than banning, which would be nice (although you could reduce this value or ban serial abusers) Thread safety is a pain.

@alesandroortiz
Copy link
Contributor

alesandroortiz commented Dec 5, 2019

Hi @Plazmaz,

Thanks for the additional context.

Glad to provide rate limit info via headers, especially if it makes it easier to implement throttling on your end. I also plan to return HTTP status code 429 (Too Many Requests) when limit is exceeded in most cases, instead of dropping packets in all cases. Both would result in better visibility and communication to users and developers.

These items were recently added to my roadmap, but I'll prioritize them for sooner so you and others can benefit due to the interest received in recent weeks.

I'll comment here once this is implemented. Can't promise anything earlier than end of January, but ambitiously hope to get it done now in December.

@Plazmaz
Copy link
Collaborator

Plazmaz commented Dec 5, 2019

I really appreciate it, that will make life a lot easier. Thanks!

@alesandroortiz
Copy link
Contributor

alesandroortiz commented Jan 18, 2020

Hi @Plazmaz,

This week I implemented rate limiting logic on Slexy.org, largely aligned with earlier comments. Full list of current rate limits and acceptable requests are in the Slexy.org Terms of Service.

This is subject to change while I fine-tune the logic, but as of Friday evening the relevant behavior is:

  • Max 30 requests per 30 seconds allowed.

  • Max 3 HTTP requests allowed per 60 seconds. (See PR Update Slexy input to use HTTPS, unique UA #94.)

  • Any requests which exceed a rate limit will have HTTP status code 429 returned, with the specific reason provided in the response document.

  • All responses include RateLimit-Limit and RateLimit-Remaining headers. For non-429 HTTPS responses, all-request values are used (RateLimit-Limit: 30). For HTTP responses, HTTP-request values are used (RateLimit-Limit: 3). Other types of requests will use the appropriate values.

    Currently, these headers provide values for the all-requests limit (30 reqs per 30 secs). This should be sufficient info for your purposes, but please note I may return different values for HTTP requests soon (e.g. HTTPS: RateLimit-Limit: 30, RateLimit-Remaining: 29; HTTP: RateLimit-Limit 3, RateLimit-Remaining 2).

  • Status 429 responses also include a Retry-After header. Currently, the value is always 60 even if the cooldown period is shorter. This may change in the future, but waiting 60 seconds is more than enough and ideally you should not be hitting rate limits in the first place.

Let me know if you encounter any unexpected header values, because a few days ago I had some edge cases which resulted in undefined behavior (since fixed, but I may have other unidentified edge cases). Feel free to send feedback here or via contact form.

@Plazmaz
Copy link
Collaborator

Plazmaz commented Jan 18, 2020

Hey, thanks for the update.
I will attempt to take a whack at this as soon as I get some spare time. I really appreciate the effort you've put in to make this easier for us, and I'll update this issue with any challenges or strange behaviors I run into.
Thanks!

@Plazmaz
Copy link
Collaborator

Plazmaz commented Feb 15, 2020

@thephpjedi we should now be compliant with your ToS and ratelimit (as of 1.3.0).

Moving back on topic for this issue, the best approach will probably be with a base object for paste inputs. This is something I've wanted to do for a while (central objects for inputs, outputs, and postprocessing will remove a LOT of redundant logic)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants