Cron Job Monitoring: Stop Silent Failures

Your cron jobs fail silently more often than you think. Heartbeat monitoring catches missed runs before they become disasters.

Marcus Johnson

September 25, 2025 · 6 min read · 3,450 views

Tags: cron, monitoring, devops, heartbeat

The Silent Failure Problem

Cron jobs are the unsung heroes of your infrastructure. They handle:

Database backups
Report generation
Data synchronisation
Cache warming
Email queue processing

When they fail, nobody notices — until it's too late.

How Heartbeat Monitoring Works

Instead of actively checking your cron job, UptimeGuard waits for a heartbeat ping from your job:

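# Add to the end of your cron job.
# -m 10 caps the request at 10 seconds; -o /dev/null keeps the response out of cron's mail.
curl -fsS -m 10 --retry 3 -o /dev/null https://hb.uptimeguard.com/YOUR_MONITOR_ID

# Full example: daily backup at 02:00.
# && sends the heartbeat only if backup.sh exits with code 0.
0 2 * * * /usr/local/bin/backup.sh && curl -fsS -m 10 --retry 3 -o /dev/null https://hb.uptimeguard.com/abc123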

If the ping doesn't arrive within the expected window, UptimeGuard alerts you.
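
To make the "expected window" concrete, here is a minimal sketch of the kind of check a heartbeat monitor runs on its side. It is an illustration only, not UptimeGuard's actual implementation: the function name heartbeat_state and its arguments are hypothetical, and all times are Unix epoch seconds.

```shell
#!/bin/sh
# Illustrative only: mimics the deadline check a heartbeat monitor performs.
# Usage: heartbeat_state LAST_PING INTERVAL GRACE NOW   (all in seconds)
heartbeat_state() {
    hs_last=$1
    hs_interval=$2
    hs_grace=$3
    hs_now=$4
    # The next ping is due one interval after the last one, plus the grace period.
    hs_deadline=$((hs_last + hs_interval + hs_grace))
    if [ "$hs_now" -le "$hs_deadline" ]; then
        echo "ok"
    else
        echo "overdue"
    fi
}

# Daily job (86400 s) with a one-hour grace period, last ping at t=1000.
heartbeat_state 1000 86400 3600 90000   # prints "ok"
heartbeat_state 1000 86400 3600 91001   # prints "overdue" (deadline is 91000)
```

The job never has to be reachable from the outside: the monitor only compares the time of the last ping against a deadline.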

Alert Conditions

Late: Job ran but took longer than expected
Missing: Job didn't run at all
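Failed: Job ran but exited with an error (exit code != 0). The job has to report this itself; with the && pattern shown earlier, a failed run sends no ping and shows up as Missing.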

Best Practices

Set grace periods: Allow a 10–20% buffer on top of the job's typical run time to absorb natural variation
Monitor execution time: A job that takes 10x longer than usual is a warning sign
Use exit codes: Send different heartbeat URLs for success vs failure
Alert on the first miss: Don't wait for multiple failures
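
The exit-code and execution-time practices above could be combined in a small wrapper, sketched below. Several things here are assumptions rather than documented UptimeGuard behaviour: the "/fail" URL suffix for reporting failures, the HB_CURL override (useful for dry runs), and the function names. It also relies on date +%s, which is widely supported but not strictly POSIX.

```shell
#!/bin/sh
# Sketch only: the "/fail" suffix and HB_CURL override are assumptions,
# not a documented UptimeGuard API. Check your monitor's real failure URL.

# Print the URL to ping for a given base URL and job exit code.
heartbeat_url() {
    hu_base=$1
    hu_code=$2
    if [ "$hu_code" -eq 0 ]; then
        printf '%s\n' "$hu_base"
    else
        printf '%s/fail\n' "$hu_base"
    fi
}

# Run a command, report its outcome, and return the command's own exit code.
# Usage: run_with_heartbeat BASE_URL COMMAND [ARGS...]
run_with_heartbeat() {
    hb_base=$1
    shift
    hb_start=$(date +%s)
    "$@"
    hb_code=$?
    hb_elapsed=$(( $(date +%s) - hb_start ))
    # Goes to stderr, so it lands in cron's mail or wherever you redirect logs.
    echo "job exited with code $hb_code after ${hb_elapsed}s" >&2
    # A failed ping must not mask the job's exit code, hence "|| true".
    "${HB_CURL:-curl}" -fsS -m 10 --retry 3 \
        "$(heartbeat_url "$hb_base" "$hb_code")" >/dev/null 2>&1 || true
    return "$hb_code"
}
```

Unlike the && one-liner, this reports failures explicitly instead of staying silent, and the logged duration gives you a baseline for spotting runs that take far longer than usual.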

Common Pitfalls

Cron timezone mismatches (always use UTC)
Environment variables not loaded in cron context
Disk space exhaustion causing silent write failures
Log rotation deleting error evidence
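
Two of these pitfalls can be guarded against inside the job itself. The helpers below are a hypothetical sketch (the names log_utc and require_free_kb are made up for illustration): one stamps log lines in UTC regardless of the host's timezone, the other checks free disk space before the job starts writing.

```shell
#!/bin/sh
# Hypothetical preflight helpers for a cron job; names are illustrative.

# Print a log line with a UTC timestamp so entries compare across hosts.
log_utc() {
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"
}

# Succeed only if the filesystem holding DIR has at least MIN_KB free.
# Usage: require_free_kb DIR MIN_KB
require_free_kb() {
    # df -P gives one line per filesystem; column 4 is available 1024-byte blocks.
    rf_avail=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ -n "$rf_avail" ] && [ "$rf_avail" -ge "$2" ]
}
```

A backup script might call require_free_kb on its target directory first and, if the check fails, log the reason with log_utc and exit non-zero so the run is reported as failed rather than writing a truncated file.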

Start monitoring your cron jobs for free with UptimeGuard.

Written by

Marcus Johnson

SRE Lead at UptimeGuard. 10+ years in infrastructure and reliability.
