Cron Job Monitoring: Stop Silent Failures

Your cron jobs fail silently more often than you think. Heartbeat monitoring catches missed runs before they become disasters.

Marcus Johnson
September 25, 2025 · 6 min read · 3,450 views
Tags: cron, monitoring, devops, heartbeat

The Silent Failure Problem

Cron jobs are the unsung heroes of your infrastructure. They handle:

  • Database backups
  • Report generation
  • Data synchronisation
  • Cache warming
  • Email queue processing

When they fail, nobody notices — until it's too late.

How Heartbeat Monitoring Works

Instead of actively checking your cron job, UptimeGuard waits for a heartbeat ping from your job:

# Add to the end of your cron job
curl -fsS --retry 3 https://hb.uptimeguard.com/YOUR_MONITOR_ID

# Full example: daily backup at 02:00 with heartbeat.
# The && means the ping is sent only if backup.sh exits with status 0.
0 2 * * * /usr/local/bin/backup.sh && curl -fsS --retry 3 https://hb.uptimeguard.com/abc123

If the ping doesn't arrive within the expected window, UptimeGuard alerts you.

Alert Conditions

  • Late: Job ran but took longer than expected
  • Missing: Job didn't run at all
  • Failed: Job ran but exited with an error (exit code != 0)
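
The three conditions can be read as a simple decision order: no signal at all, then a non-zero exit code, then a run that took too long. The sketch below is one hypothetical way to express that. It assumes the job reports its exit code and duration to the monitor, which a bare ping does not do by itself. `classify_run` and its arguments are illustrative names only.

```shell
#!/bin/sh
# Hypothetical sketch: classify one scheduled run.
# Usage: classify_run PINGED EXIT_CODE DURATION_SECONDS MAX_SECONDS
# PINGED is 1 if any signal arrived inside the expected window, 0 otherwise.
classify_run() {
    pinged=$1
    exit_code=$2
    duration=$3
    max_seconds=$4
    if [ "$pinged" -eq 0 ]; then
        echo missing      # nothing arrived: the job did not run, or could not report
    elif [ "$exit_code" -ne 0 ]; then
        echo failed       # the job ran but reported an error
    elif [ "$duration" -gt "$max_seconds" ]; then
        echo late         # the job succeeded but took longer than expected
    else
        echo ok
    fi
}
```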

Best Practices

  1. Set grace periods: allow a 10–20% buffer on top of the job's usual runtime to absorb natural variation
  2. Monitor execution time: a job that takes 10x longer than usual is a warning sign
  3. Use exit codes: send different heartbeat URLs for success vs failure
  4. Alert on the first miss: don't wait for multiple failures
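
Practices 2 and 3 can be combined in a small wrapper, sketched below under some assumptions. The two URLs are placeholders for two separately configured monitors, and `MAX_SECONDS`, `ping_heartbeat` and `run_with_heartbeat` are hypothetical names. `date +%s` is widely available on GNU and BSD systems but is not required by POSIX.

```shell
#!/bin/sh
# Hypothetical placeholders: one monitor URL per outcome.
HB_SUCCESS_URL="${HB_SUCCESS_URL:-https://hb.uptimeguard.com/YOUR_SUCCESS_MONITOR_ID}"
HB_FAILURE_URL="${HB_FAILURE_URL:-https://hb.uptimeguard.com/YOUR_FAILURE_MONITOR_ID}"
MAX_SECONDS="${MAX_SECONDS:-600}"   # assumed upper bound for a normal run

# Send one ping. A monitoring hiccup must never change the job's own result.
ping_heartbeat() {
    curl -fsS -m 10 --retry 3 -o /dev/null "$1" || true
}

# Usage: run_with_heartbeat COMMAND [ARGS...]
# Runs the command, warns on slow runs, pings the URL matching the outcome,
# and returns the command's original exit status.
run_with_heartbeat() {
    start=$(date +%s)
    "$@"
    status=$?
    elapsed=$(( $(date +%s) - start ))
    if [ "$elapsed" -gt "$MAX_SECONDS" ]; then
        echo "warning: job took ${elapsed}s (limit ${MAX_SECONDS}s)" >&2
    fi
    if [ "$status" -eq 0 ]; then
        ping_heartbeat "$HB_SUCCESS_URL"
    else
        ping_heartbeat "$HB_FAILURE_URL"
    fi
    return "$status"
}
```

A job script would source this file and call `run_with_heartbeat /usr/local/bin/backup.sh`. Returning the original status keeps cron's own mail and logging behaviour unchanged.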

Common Pitfalls

  • Cron timezone mismatches (schedules follow the server's local time zone, so standardize on UTC)
  • Environment variables not loaded in cron context
  • Disk space exhaustion causing silent write failures
  • Log rotation deleting error evidence
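
Two of these pitfalls can be guarded against at the top of the job script itself. The preamble below is a minimal sketch: the `PATH` entries, the helper names `free_kb` and `require_free_kb`, and any threshold you pass in are assumptions to adapt to your own system.

```shell
#!/bin/sh
# Cron starts jobs with a minimal environment, so set what the job needs explicitly.
PATH="/usr/local/bin:/usr/bin:/bin${PATH:+:$PATH}"
TZ=UTC   # affects timestamps this script produces, not when cron fires it
export PATH TZ

# free_kb DIR: print the space available on DIR's filesystem, in 1024-byte blocks.
# df -P guarantees one line per filesystem; column 4 is "Available".
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# require_free_kb DIR MIN_KB: succeed only if at least MIN_KB is available.
require_free_kb() {
    avail=$(free_kb "$1") || return 1
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}
```

Calling something like `require_free_kb /var/backups 1048576 || exit 1` before the real work turns a silent write failure into a non-zero exit that the monitor can see. The path and the 1 GiB threshold here are only examples.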

Start monitoring your cron jobs for free with UptimeGuard.

Written by

Marcus Johnson

SRE Lead at UptimeGuard. 10+ years in infrastructure and reliability.