The cronjobs (monitor_metrics*) get stuck and keep spawning when replica is taking a checkpoint #23

Open
hueyvle opened this issue Dec 18, 2021 · 5 comments

Comments

@hueyvle

hueyvle commented Dec 18, 2021

I have to remove the p4prom installation and the cron jobs from the replica because of this.

I have quite a large environment where taking a checkpoint takes 12+ hours.
During that time, all Perforce commands hang, including the cron jobs.

Is there any workaround for this?

@rcowham
Contributor

rcowham commented Dec 18, 2021

Which version of the server are you running?
If it is p4d 2021.1 or later, then we can turn on Realtime Monitoring for the server:
https://www.perforce.com/manuals/cmdref/Content/CmdRef/p4_monitor.html
The script could read the value of 'rtv.db.ckp.active' to detect a checkpoint. If that's an option for you, I am happy to look at putting detection for this situation in place.
Please note that we usually recommend doing offline checkpoints, as done by SDP. This avoids locking the live database, even on a replica. https://community.perforce.com/s/article/2419
SDP: https://swarm.workshop.perforce.com/projects/perforce-software-sdp

@hueyvle
Author

hueyvle commented Dec 18, 2021

We have p4d 2015.2. The db.* files are huge (over 100G).
We have p4prometheus installed on the master, and it runs without any issue. However, when installing it on the replica, we realized that everything freezes while a checkpoint is running (somewhat expected).

If the monitor_metrics* scripts could detect the locking and stop spawning new jobs, that would work.
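
One way to stop jobs from piling up, sketched here in Python rather than in the language of the actual monitor_metrics* scripts: each run tries to take a non-blocking exclusive lock on a lock file and exits immediately if an earlier run still holds it. The function name and the lock path are hypothetical and not part of p4prometheus.

```python
import fcntl
import os

# Hypothetical lock file location; any path writable by the cron user works.
LOCK_PATH = "/tmp/monitor_metrics.lock"


def try_acquire(lock_path):
    """Return an fd holding an exclusive lock, or None if another run holds it."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # LOCK_NB makes flock fail at once instead of waiting for the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def run_once(collect, lock_path=LOCK_PATH):
    """Call collect() only if no earlier run is still active; return True if it ran."""
    fd = try_acquire(lock_path)
    if fd is None:
        return False  # an earlier run is still stuck, so skip this one
    try:
        collect()
    finally:
        os.close(fd)  # closing the descriptor releases the lock
    return True
```

With this guard, a run that hangs behind a checkpoint keeps the lock, and every later cron invocation returns straight away instead of queuing up behind it.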

@rcowham
Contributor

rcowham commented Dec 18, 2021

Do you have lslocks installed?

@hueyvle
Author

hueyvle commented Dec 18, 2021

Yes, I do.
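
With lslocks available, a script could list the locks held on the db.* files before it issues any p4 command. A minimal Python sketch, assuming the output of `lslocks -n -r -o COMMAND,PID,MODE,PATH`; the function names and the sample path in the usage note are hypothetical. Deciding which lock pattern actually means a checkpoint is in progress is left to the caller, since this thread does not settle that.

```python
import os
import subprocess


def parse_lslocks(text, db_dir):
    """Return (command, pid, mode, path) tuples for locks on db.* files in db_dir."""
    db_dir = os.path.normpath(db_dir)
    locks = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue  # skip lines without a PATH value
        command, pid, mode, path = parts
        in_dir = os.path.dirname(path) == db_dir
        if in_dir and os.path.basename(path).startswith("db."):
            locks.append((command, int(pid), mode, path))
    return locks


def current_db_locks(db_dir):
    """Run lslocks (util-linux) and parse its raw, header-less output."""
    result = subprocess.run(
        ["lslocks", "-n", "-r", "-o", "COMMAND,PID,MODE,PATH"],
        capture_output=True, text=True, check=True,
    )
    return parse_lslocks(result.stdout, db_dir)
```

For example, `current_db_locks("/p4/1/root")` (a hypothetical server root) returns the locks held on db.* files in that directory, and a monitoring script could skip its p4 calls when the result suggests the database is locked.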

@rcowham
Contributor

rcowham commented Dec 18, 2024

Note that the cron job can now run as a systemd timer, which means it is always single-threaded (a new run does not start while the previous one is still active).
I will update the installer to make this standard shortly.

@rcowham rcowham closed this as completed Dec 18, 2024
@rcowham rcowham reopened this Dec 18, 2024