Guides
Cron Job Monitoring: Stop Silent Failures
Your cron jobs fail silently more often than you think. Heartbeat monitoring catches missed runs before they become disasters.
Tags: cron, monitoring, devops, heartbeat
The Silent Failure Problem
Cron jobs are the unsung heroes of your infrastructure. They handle:
- Database backups
- Report generation
- Data synchronisation
- Cache warming
- Email queue processing
When they fail, nobody notices — until it's too late.
How Heartbeat Monitoring Works
Instead of actively checking your cron job, UptimeGuard waits for a heartbeat ping from your job:
# Add to the end of your cron job
curl -fsS --retry 3 https://hb.uptimeguard.com/YOUR_MONITOR_ID
# Full example: daily backup at 02:00. The && sends the heartbeat only if backup.sh exits 0
0 2 * * * /usr/local/bin/backup.sh && curl -fsS --retry 3 https://hb.uptimeguard.com/abc123
If the ping doesn't arrive within the expected window, UptimeGuard alerts you.
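To make the "expected window" concrete, here is a minimal sketch of the check a heartbeat monitor could run on its side. It is an illustration under simple assumptions, not UptimeGuard's actual implementation: the function names are hypothetical, and all times are Unix epoch seconds.

```shell
# Sketch only: function names and the period/grace model are assumptions.

# heartbeat_deadline LAST_PING PERIOD GRACE
# Prints the latest moment the next ping may arrive before an alert fires.
heartbeat_deadline() {
    echo $(( $1 + $2 + $3 ))
}

# heartbeat_overdue LAST_PING PERIOD GRACE NOW
# Succeeds (exit status 0) when NOW is past the deadline.
heartbeat_overdue() {
    [ "$4" -gt "$(heartbeat_deadline "$1" "$2" "$3")" ]
}
```

For a daily job (period 86400 seconds) with a one-hour grace period (3600 seconds), a last ping at 1000000 gives a deadline of 1090000. Any check after that moment would raise an alert.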
Alert Conditions
- Late: Job ran but took longer than expected
- Missing: Job didn't run at all
- Failed: Job ran but exited with an error (exit code != 0). The job has to report this itself: with the && pattern above, a failed run simply sends no ping
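The three conditions can be told apart with a few comparisons. The sketch below assumes the monitor knows the exit code the job reported (empty if no ping arrived at all), when the ping arrived, and the deadline for that run. `classify_run` is a hypothetical helper, not part of any UptimeGuard API.

```shell
# classify_run EXIT_CODE ARRIVED_AT DEADLINE
# EXIT_CODE is empty when no ping was received; times are epoch seconds.
classify_run() {
    exit_code=$1
    arrived_at=$2
    deadline=$3

    if [ -z "$exit_code" ]; then
        echo missing          # the job never reported in
    elif [ "$exit_code" -ne 0 ]; then
        echo failed           # the job ran but reported an error
    elif [ "$arrived_at" -gt "$deadline" ]; then
        echo late             # the job succeeded, but after the deadline
    else
        echo ok
    fi
}
```

The order of the checks matters: a failure is reported as failed even when it also arrives late, since the error is the more actionable signal.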
Best Practices
- Set grace periods: allow a 10–20% buffer for natural variation in how long the job takes
- Monitor execution time: a job that takes 10x longer than usual is a warning sign, even if it eventually succeeds
- Use exit codes: send different heartbeat URLs for success and failure
- Alert on the first miss: don't wait for multiple failures
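Two of these practices, exit codes and execution time, can live in one small wrapper. The following is a sketch under stated assumptions: `run_with_heartbeat` is a hypothetical function, both URLs are passed in by the caller (how a failure URL looks depends on your monitor's setup), and `date +%s` is assumed to be available, which holds for GNU and BSD `date`.

```shell
# run_with_heartbeat SUCCESS_URL FAILURE_URL TYPICAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND, pings one of the two URLs based on its exit code,
# and warns on stderr when the run took more than 10x the typical time.
run_with_heartbeat() {
    success_url=$1
    failure_url=$2
    typical_seconds=$3
    shift 3

    started_at=$(date +%s)
    "$@"
    status=$?
    duration=$(( $(date +%s) - started_at ))

    if [ "$duration" -gt $(( typical_seconds * 10 )) ]; then
        echo "warning: job took ${duration}s, typical is ${typical_seconds}s" >&2
    fi

    if [ "$status" -eq 0 ]; then
        url=$success_url
    else
        url=$failure_url
    fi

    # -m 10 caps the request time so a slow network cannot hang the job.
    curl -fsS -m 10 --retry 3 -o /dev/null "$url" ||
        echo "warning: heartbeat ping failed" >&2

    # Preserve the job's own exit code for cron and any calling script.
    return "$status"
}
```

A job script could source this function and call it as `run_with_heartbeat "$OK_URL" "$FAIL_URL" 600 /usr/local/bin/backup.sh`, where the two variables hold the URLs configured for that monitor and 600 is the job's usual runtime in seconds.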
Common Pitfalls
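- Cron timezone mismatches (cron schedules follow the server's local time zone, so standardise on UTC)
- Environment variables not loaded in cron context (cron does not read your login shell profile, so PATH and custom variables may be missing)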
- Disk space exhaustion causing silent write failures
- Log rotation deleting error evidence
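Several of these pitfalls can be caught by a short preflight at the top of the job script. The helpers below are a sketch, and all three function names are hypothetical. They cover missing environment variables, low disk space, and timestamps that stay in UTC regardless of the server's time zone; log rotation still needs to be handled in your rotation config.

```shell
# require_env NAME...: fails if any named variable is unset or empty.
require_env() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "error: $name is not set in the cron environment" >&2
            return 1
        fi
    done
}

# has_free_space DIR MIN_KB: succeeds when DIR's filesystem has at least
# MIN_KB kilobytes available. Relies on the POSIX "df -P" output format,
# where the fourth column of the second line is the available space.
has_free_space() {
    available_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ -n "$available_kb" ] && [ "$available_kb" -ge "$2" ]
}

# log_utc MESSAGE...: prints MESSAGE prefixed with a UTC timestamp.
log_utc() {
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"
}
```

Failing the preflight before the real work starts means the job exits non-zero, so the heartbeat is withheld or a failure is reported instead of a silent partial write.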
Start monitoring your cron jobs for free with UptimeGuard.
Written by
Marcus Johnson
SRE Lead at UptimeGuard. 10+ years in infrastructure and reliability.