Kubernetes CronJob Monitoring from Sandglass: practical guidance for monitoring the business result of a CronJob instead of only watching Kubernetes object state.
A CronJob can look healthy to Kubernetes while the work it was meant to do never finished. The goal of this guide is to settle how the team detects and responds to that case before a stressful incident forces it to improvise.
Send a heartbeat from the container only after the job completes successfully, and use cluster events to debug when the heartbeat is late or missing. Sandglass supports the ongoing side of this work with checks, incidents, alert routing, and public status visibility.
Kubernetes status can show scheduling and pod history, but an external heartbeat proves that the job reached its success path.
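As a rough illustration of the heartbeat pattern described above, the sketch below wraps the job command and pings a heartbeat URL only when the command exits with status 0. The URL, the function name `run_with_heartbeat`, and the injectable `ping` parameter are assumptions made for this example, not part of any Sandglass API; substitute the heartbeat URL your monitoring tool issues.

```python
import subprocess
import sys
import urllib.request

# Hypothetical placeholder; replace with the heartbeat URL from your monitoring tool.
HEARTBEAT_URL = "https://example.invalid/heartbeat/nightly-export"


def run_with_heartbeat(command, heartbeat_url, ping=None, timeout=10):
    """Run the job command and ping the heartbeat only if it exits with 0.

    Returns the exit code the container should report to Kubernetes.
    """
    if ping is None:
        def ping(url):
            # Plain GET request; any OSError (including URLError) propagates.
            with urllib.request.urlopen(url, timeout=timeout) as response:
                response.read()

    result = subprocess.run(command)
    if result.returncode != 0:
        # No ping on failure: the missing heartbeat is what triggers the alert.
        return result.returncode

    try:
        ping(heartbeat_url)
    except OSError as exc:
        # The job itself succeeded, so log the ping failure without failing the pod.
        print(f"heartbeat ping failed: {exc}", file=sys.stderr)
    return 0
```

A container entrypoint could call `sys.exit(run_with_heartbeat(sys.argv[1:], HEARTBEAT_URL))` so the pod keeps the job's real exit code. Skipping the ping on failure is a design choice: a late heartbeat then covers jobs that crash, hang, or never get scheduled, and cluster events explain which of those happened.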
Decide which CronJob failures actually reach customers before adding any monitoring.
Match each risk to a single HTTP, content, TCP, SSL certificate, or heartbeat check instead of stacking duplicates.
Give each alert one owner and one destination: email, a Slack webhook, or a generic webhook.
Revisit intervals, thresholds, and ownership once a real incident shows what was missing.
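The checklist above can be sketched as a small review script that flags plans breaking these rules. This is a hypothetical data model for illustration only: the `PlannedCheck` class, the `review_plan` function, and the lowercase type and destination labels are assumptions, not a Sandglass configuration format.

```python
from collections import Counter
from dataclasses import dataclass

# Labels mirror the check types and destinations named in the checklist (spelling assumed).
CHECK_TYPES = {"http", "content", "tcp", "ssl", "heartbeat"}
DESTINATIONS = {"email", "slack_webhook", "webhook"}


@dataclass(frozen=True)
class PlannedCheck:
    risk: str         # the customer-facing failure this check covers
    check_type: str   # one of CHECK_TYPES
    owner: str        # the single person or team responsible for the alert
    destination: str  # one of DESTINATIONS


def review_plan(checks):
    """Return a list of human-readable problems found in a monitoring plan."""
    problems = []
    for check in checks:
        if check.check_type not in CHECK_TYPES:
            problems.append(f"{check.risk}: unknown check type {check.check_type!r}")
        if check.destination not in DESTINATIONS:
            problems.append(f"{check.risk}: unknown destination {check.destination!r}")
        if not check.owner.strip():
            problems.append(f"{check.risk}: no owner assigned")
    # More than one check per risk counts as a stacked duplicate.
    for risk, count in Counter(check.risk for check in checks).items():
        if count > 1:
            problems.append(f"{risk}: {count} checks cover the same risk")
    return problems
```

Running `review_plan` during a post-incident review is one way to revisit ownership and coverage in a repeatable form; an empty list means the plan satisfies the rules as encoded here.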