-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
workaround: lock ttl/expiration #20
Conversation
|
||
To work around that, `FLEISS_LOCK_TTL` (seconds) ENV variable can be set. This should be larger than maximum expected job perform time. | ||
|
||
Jobs that are started but not finished in `FLEISS_LOCK_TTL` seconds can be picked up by other workers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better English suggestions are welcome ^^
@@ -1,6 +1,6 @@ | |||
Gem::Specification.new do |s| | |||
s.name = 'fleiss' | |||
s.version = '0.4.4' | |||
s.version = '0.4.5' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No incompatible changes. This is a new, optional feature to be explicitly opted-in by setting FLEISS_LOCK_TTL
.
@@ -4,6 +4,8 @@ class ActiveRecord | |||
module Concern | |||
extend ActiveSupport::Concern | |||
|
|||
DEFAULT_LOCK_TTL = ENV['FLEISS_LOCK_TTL']&.to_i # seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No dynamic / official config (per-worker or so).
Considering this a tweak (even hack).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so now I see it, I don't like this. Let's add a proper heartbeart instead. How about:
- add a
lock_expires_at
column - not sure how to deal with the migration of this? - document a
def max_run_time
callback (like we do withdef ttl
) - set
lock_expires_at
whenjob.max_run_time
is set - much like https://github.com/bsm/fleiss/blob/main/lib/fleiss/backend/active_record/concern.rb#L46 - fix
def pending
scope
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find suggested solution viable but I don't like it too much (still sound workaround-ish):
- if you call it
max_run_time
- that sounds like you imply that job can be killed when this is reached (Time.now > started_at + max_run_time
) - if you call it proper
lock_ttl
(which better be done globally, per worker) - this is kinda closer, but also not worth much without a proper heartbeat in a separate thread (maybe one per-worker thread, not per job)
I tend to implement second option (one heartbeat thread per worker).
UPD: looks like https://www.rubydoc.info/gems/concurrent-ruby/Concurrent/TimerTask is the thing for it.
UPD2: actually, it cannot be one per worker, but like one per pooled thread because job "owner" is based on thread ID... How much deeper is the rabbit hole? :)
Btw, migrations are solved by create_table(...) unless table_exists?(...)
and add_column(...) unless column_exists?(...)
.
@@ -4,6 +4,8 @@ class ActiveRecord | |||
module Concern | |||
extend ActiveSupport::Concern | |||
|
|||
DEFAULT_LOCK_TTL = ENV['FLEISS_LOCK_TTL']&.to_i # seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so now I see it, I don't like this. Let's add a proper heartbeart instead. How about:
- add a
lock_expires_at
column - not sure how to deal with the migration of this? - document a
def max_run_time
callback (like we do withdef ttl
) - set
lock_expires_at
whenjob.max_run_time
is set - much like https://github.com/bsm/fleiss/blob/main/lib/fleiss/backend/active_record/concern.rb#L46 - fix
def pending
scope
.order(priority: :desc) | ||
.order(scheduled_at: :asc) | ||
|
||
pending.or(lock_expired(now, DEFAULT_LOCK_TTL)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's not right. if something is e.g. finished or expired, you don't want to include it!!
it should be:
not_finished
.not_exipred
.where(arel_table[:scheduled_at].lteq(now))
.merge(not_started(now).or(lock_expired(now))
.order(priority: :desc, scheduled_at: :asc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦
I'll close this one - too much to change, better re-submit once ready. |
Addresses / Solves #13