Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't broadcast Checkable#next_check updates made just not to check twice #10093

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 5 additions & 5 deletions lib/icinga/checkable-check.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ long Checkable::GetSchedulingOffset()
return m_SchedulingOffset;
}

void Checkable::UpdateNextCheck(const MessageOrigin::Ptr& origin)
void Checkable::UpdateNextCheck(const MessageOrigin::Ptr& origin, bool suppressEvents)
{
double interval;

Expand All @@ -75,7 +75,7 @@ void Checkable::UpdateNextCheck(const MessageOrigin::Ptr& origin)
<< "' from last check time at " << Utility::FormatDateTime("%Y-%m-%d %H:%M:%S %z", (lastCheck < 0 ? 0 : lastCheck))
<< " (" << GetLastCheck() << ") to next check time at " << Utility::FormatDateTime("%Y-%m-%d %H:%M:%S %z", nextCheck) << " (" << nextCheck << ").";

SetNextCheck(nextCheck, false, origin);
SetNextCheck(nextCheck, suppressEvents, origin);
}

bool Checkable::HasBeenChecked() const
Expand Down Expand Up @@ -563,11 +563,11 @@ void Checkable::ExecuteCheck()

SetLastCheckStarted(Utility::GetTime());

/* This calls SetNextCheck() which updates the CheckerComponent's idle/pending
/* This calls SetNextCheck() for a later update of the CheckerComponent's idle/pending
* queues and ensures that checks are not fired multiple times. ProcessCheckResult()
* is called too late. See #6421.
*/
UpdateNextCheck();
UpdateNextCheck(nullptr, true);
Comment on lines -566 to +570
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for a later update of the CheckerComponent's idle/pending

When will this later update happen?

ProcessCheckResult() is called too late.

In particular, is it earlier than this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for a later update of the CheckerComponent's idle/pending

When will this later update happen?

  1. First, CheckerComponent::ExecuteCheckHelper calls checkable->ExecuteCheck (https://github.com/Icinga/icinga2/blob/v2.14.2/lib/checker/checkercomponent.cpp#L233)
  2. That calls UpdateNextCheck (https://github.com/Icinga/icinga2/blob/v2.14.2/lib/icinga/checkable-check.cpp#L563)
  3. Finally, CheckerComponent::ExecuteCheckHelper updates the index (https://github.com/Icinga/icinga2/blob/v2.14.2/lib/checker/checkercomponent.cpp#L266)

ProcessCheckResult() is called too late.

In particular, is it earlier than this?

Imagine, you have a simple Python plugin doing some basic network I/O. I know that Python/C++ meme is a bit silly, but actually, compared to how quick CheckerComponent returns to its loop which gets the next item from this index, your Python plugin (and ProcessCheckResult) takes centuries.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine, you have a simple Python plugin doing some basic network I/O. I know that Python/C++ meme is a bit silly, but actually, compared to how quick CheckerComponent returns to its loop which gets the next item from this index, your Python plugin (and ProcessCheckResult) takes centuries.

I don't get what this is trying to say but it sounds like you're describing a race condition. Are you trying to say it's not a problem because check plugins will be slow enough? But that sounds like the opposite of the comment, the slower the plugin, the later ProcessCheckResult() will be called. On the other hand, you can also get quickly failing checks by specifying a non-existent path for example so that executing it fails immediately. That shouldn't break Icinga 2 either.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, no race condition. Checkable::ExecuteCheck calls first UpdateNextCheck, then GetCheckCommand()->Execute. The plugin can't fail earlier than UpdateNextCheck is called.

But, what I've written about: Because plugins are in general rather slow, ProcessCheckResult() comes with a latency. But the checker index needs SetNextCheck now(!!), that's why UpdateNextCheck is called. It's that simple IIRC.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


bool reachable = IsReachable();

Expand Down Expand Up @@ -636,7 +636,7 @@ void Checkable::ExecuteCheck()
* a check result from the remote instance. The check will be re-scheduled
* using the proper check interval once we've received a check result.
*/
SetNextCheck(Utility::GetTime() + GetCheckCommand()->GetTimeout() + 30);
SetNextCheck(Utility::GetTime() + GetCheckCommand()->GetTimeout() + 30, true);

/*
* Let the user know that there was a problem with the check if
Expand Down
2 changes: 1 addition & 1 deletion lib/icinga/checkable.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -99,7 +99,7 @@ class Checkable : public ObjectImpl<Checkable>
long GetSchedulingOffset();
void SetSchedulingOffset(long offset);

void UpdateNextCheck(const MessageOrigin::Ptr& origin = nullptr);
void UpdateNextCheck(const MessageOrigin::Ptr& origin = nullptr, bool suppressEvents = false);

bool HasBeenChecked() const;
virtual bool IsStateOK(ServiceState state) const = 0;
Expand Down
Loading