Prometheus How-To Guides
Let's get your metrics scraped and your alerts firing!
(Looking for advanced PromQL magic? See the official Prometheus documentation.)
Ensuring Your Service is Scraped
We keep it simple: we use standard Kubernetes annotations to tell Prometheus exactly what to scrape.
To ensure your metrics make it to the dashboard, your pods (not your deployments!) must live in a monitored namespace and wear these three annotations:
prometheus.io/scrape
prometheus.io/port
prometheus.io/path
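As a rough sketch of where those annotations go, here is a hypothetical Deployment whose pod template carries all three. The service name, namespace, image, port 8080, and the /metrics path are placeholders chosen for illustration, not values from our setup; swap in your own. Note that Kubernetes annotation values must be strings, so the port and the boolean are quoted.

```yaml
# Hypothetical Deployment: every name, the image, the port, and the path are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service              # hypothetical service name
  namespace: my-namespace       # assumed to be a monitored namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
      annotations:
        # These sit on the pod template, not on the Deployment's own metadata.
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"       # assumed metrics port
        prometheus.io/path: "/metrics"   # assumed metrics path
    spec:
      containers:
        - name: my-service
          image: example.com/my-service:1.0.0   # placeholder image
          ports:
            - containerPort: 8080
```

Annotations placed under the top-level metadata of the Deployment are not copied to its pods, which is why they belong under spec.template.metadata.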
Managing Prometheus Alerts
We manage our Prometheus alerts centrally as code! These rules use PromQL expressions to constantly evaluate conditions over time.
Here is what a rule looks like:
Alerts Config
alerting_rules.yml:
  groups:
    - name: blackbox_alerts
      rules:
        - alert: CertExpiration
          expr: ((probe_ssl_earliest_cert_expiry{job="blackbox"} - time()) / 3600 / 24 < 30)
          for: 90s
          labels:
            severity: 'critical'
          annotations:
            description: "A certificate ({{ $labels.instance }}) is about to expire in 30 days!"
            summary: "A certificate ({{ $labels.instance }}) is about to expire in 30 days!"
Want to add or change an alert? No problem! Just open a Pull Request right here with your shiny new rule, and ping the Infra team to review and deploy it!
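If you are not sure what a new rule might look like, here is a minimal sketch of one you could propose. The group name, alert name, 5m duration, and severity are all hypothetical choices for illustration; the only built-in piece is the `up` metric, which Prometheus sets to 0 for a target whose scrape failed. Where exactly the group goes in our rules file is for the Infra team to confirm during review.

```yaml
# Hypothetical rule group: names, duration, and severity are illustrative only.
groups:
  - name: example_alerts
    rules:
      - alert: InstanceDown
        # `up` is recorded by Prometheus itself: 1 if the last scrape succeeded, 0 if it failed.
        expr: up == 0
        # The condition must hold for this long before the alert moves from pending to firing.
        for: 5m
        labels:
          severity: 'warning'
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been unreachable for more than 5 minutes."
```

A longer `for` value trades slower notification for fewer alerts on brief blips, so pick one that matches how quickly someone actually needs to react.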