Dead Man's Switch Monitoring from Sandglass: practical guidance for alerting when an expected success signal never arrives.
A dead man's switch inverts the usual alert: it fires when an expected good signal is missing, not when a bad one is observed. The goal is to settle who gets alerted, and what they do next, before a stressful incident forces the team to improvise.
Have the job call a heartbeat URL after it completes successfully, and configure Sandglass to alert when the next expected signal is late. Sandglass covers the ongoing side of this work with checks, incidents, alert routing, and public status visibility.
A heartbeat sent at job start only proves the job began. Send the signal after the critical work, so that a failure partway through shows up as a missing signal.
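As a minimal sketch of that ordering, the wrapper below runs a job and pings only once the job has returned without raising. The names `run_with_heartbeat`, `send_heartbeat`, and `HEARTBEAT_URL` are illustrative and not part of Sandglass. The URL is a placeholder, and the sketch assumes a plain GET request is enough to register a signal.

```python
import urllib.request
from typing import Callable

# Hypothetical placeholder: substitute the heartbeat URL of your own check.
HEARTBEAT_URL = "https://example.invalid/heartbeat/your-check-id"


def send_heartbeat(url: str = HEARTBEAT_URL, timeout: float = 10.0) -> None:
    """Send a GET to the heartbeat URL (assumed to count as one signal)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def run_with_heartbeat(job: Callable[[], None], ping: Callable[[], None]) -> None:
    """Run the job, then ping only if the job returned without raising."""
    # If job() raises, the exception propagates and ping() is never reached,
    # so the monitor sees a missing signal.
    job()
    try:
        ping()
    except OSError:
        # The work itself succeeded; a failed ping (urllib errors derive from
        # OSError) simply surfaces as a late signal on the monitoring side.
        pass


# Example usage (not executed here, since it would make a network call):
#     run_with_heartbeat(nightly_backup, send_heartbeat)
```

Passing the ping in as a callable keeps the ordering easy to test without a network. Whether to swallow a failed ping, as this sketch does, or retry it is a design choice that depends on how noisy false alerts would be.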
Before adding any monitoring, decide which missed or failed jobs would actually reach customers.
Match each risk to a single HTTP, content, TCP, SSL certificate, or heartbeat check instead of stacking duplicates.
Give each alert one owner and one destination: email, a Slack webhook, or a generic webhook.
Revisit intervals, thresholds, and ownership once a real incident shows what was missing.
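When revisiting intervals and thresholds, it can help to write down the lateness rule you have in mind. The sketch below is one plausible rule, not a description of how Sandglass evaluates checks: a signal counts as late once the expected interval plus a grace period has passed since the last one. The names `is_late`, `interval`, and `grace` are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_late(
    last_signal: datetime,
    interval: timedelta,
    grace: timedelta,
    now: datetime,
) -> bool:
    """Return True once `now` is past last_signal + interval + grace."""
    deadline = last_signal + interval + grace
    return now > deadline


# Example: a nightly job that last reported at 02:00 UTC, with a 30-minute grace.
last = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
interval = timedelta(hours=24)
grace = timedelta(minutes=30)

on_time = is_late(last, interval, grace, datetime(2024, 1, 2, 2, 15, tzinfo=timezone.utc))
overdue = is_late(last, interval, grace, datetime(2024, 1, 2, 2, 31, tzinfo=timezone.utc))
```

Here `on_time` is False and `overdue` is True. A grace period shorter than the job's normal runtime variation tends to produce false alerts, while a very long one delays detection, so real incidents are a good prompt to adjust it.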