Setup Grafana OnCall @Home
Hello visitor! I hope you are fine.
It’s nice to meet you here.
Oh, you are passionate about alerting? Like me! What a coincidence!
OK, I know, you are here because of the title…
I’m so happy to share my passion with you. Where to start… hmm… have you heard about Grafana OnCall?!
Will it continue like this for long?
It’s totally awesome, let’s talk about it. Come on, grab a lemonade and be prepared for the fun!
Fun, really? Oh, God!
What is Grafana OnCall?
Grafana OnCall is an open-source on-call management system developed by Grafana Labs, the company famous for products like Grafana and Loki. Their products are awesome, open-source (and we love open-source) and self-hostable.
If you don’t know Grafana’s products well, you could spend the rest of the day exploring them. It will be worth it!
An on-call management system becomes useful when you have many alerts to manage, on-call rotations to handle and lots of people to organize! Grafana and Alertmanager are great at forwarding alerts to a notification channel, but that remains limited for professional use. Grafana OnCall introduces on-call schedules, SMS/phone alerts and complex incident management.
General setup
Required material
- Kubernetes cluster (Suggestions: k3s or k3d)
- A configured Grafana instance (OnCall’s chart can provide one, but I prefer to integrate OnCall with my own instance, don’t you?)
OnCall Installation
Useful links
Default
# Install grafana repository
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Default configuration
helm install \
--wait \
--set base_url=example.com \
--set grafana."grafana\.ini".server.domain=example.com \
release-oncall \
grafana/oncall
Suggested values
That’s probably the main reason why I am writing this article. I had to track down a lot of specific configuration, and there are not many articles or GitHub issues about it. Here is my final values file with the correct configuration.
base_url: oncall.example.org
base_url_protocol: https
env:
  # https://community.grafana.com/t/grafana-oss-oncall-invalid-token-for-cloud/109543/8
  # Author's note: this seems to be the URL for all European users. In your case it might be different.
  # Anyway, when you create a token on Grafana Cloud, they give you a URL... that is probably the correct one to use.
  GRAFANA_CLOUD_ONCALL_API_URL: https://oncall-prod-eu-west-0.grafana.net/oncall
ui:
  # Author's note: when enabled, it requires a Docker image which is not public
  enabled: false # Seems to be a private project
oncall:
  devMode: false
  mirageCipherIV: random-16-bytes-string
ingress:
  enabled: true
  # Author's note: add your favorite ingress configuration
ingress-nginx:
  enabled: false
externalGrafana:
  # Author's note: I use my own Grafana instance. Don't forget to add the port when
  # you use internal DNS!
  url: http://my-grafana-service.monitoring:3000
database:
  type: postgresql
# It was easier to configure, and I think it's better to keep the persistence
# layer in a secured Postgres, so we don't have to worry about our test project's persistence, PVC, etc.
externalPostgresql:
  host: db_host
  port: 5432
  db_name: db_name
  user: db_user
  password: db_password
  # Use an existing secret for the database password
  existingSecret:
  # The key in the secret containing the database password
  passwordKey:
  # Extra options (see example below)
  # Reference: https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS
  options:
  # options: >-
  #   sslmode=verify-full
  #   sslrootcert=/mnt/postgres-tls/ca.crt
  #   sslcert=/mnt/postgres-tls/client.crt
  #   sslkey=/mnt/postgres-tls/client.key
# Author's note: we use our own Grafana instance
grafana:
  enabled: false
cert-manager:
  enabled: false
mariadb:
  enabled: false
# Author's note: we use our own PostgreSQL instance
postgresql:
  enabled: false
redis:
  enabled: true
  architecture: standalone
  auth:
    enabled: true
  master:
    # Author's note: personal opinion, I don't like stateful apps,
    # so if it's not necessary, I let applications lose their data.
    # Please do as you want!
    persistence:
      enabled: false
# Author's note: RabbitMQ was pretty hard to configure correctly. This is my working configuration.
# Tip: if you need to change something, make sure the PVC is destroyed and recreated,
# to guarantee a blank and fresh configuration.
rabbitmq:
  auth:
    username: user
    password: long-and-strong-password
  rbac:
    create: false
  clustering:
    enabled: false
  serviceAccount:
    create: true
    name: rabbitmq-ha
    automountServiceAccountToken: true
  networkPolicy:
    enabled: true
    allowExternal: true
  persistence:
    enabled: true
    size: 500Mi
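The mirageCipherIV placeholder in these values has to be replaced with a real 16-character value, and the file then has to be handed to Helm. Here is a minimal sketch, assuming the values are saved as values.yaml (a hypothetical file name) and reusing the release-oncall release name from the default install. The helper names generate_iv and install_oncall are made up for illustration and are not part of the chart.

```shell
#!/bin/sh
# Print 16 hexadecimal characters (8 random bytes) suitable for oncall.mirageCipherIV.
generate_iv() {
  od -An -tx1 -N8 /dev/urandom | tr -d ' \n'
}

# Install the chart with a custom values file.
# $1 = path to the values file (defaults to values.yaml, an assumed name).
install_oncall() {
  values_file="${1:-values.yaml}"
  helm install --wait -f "$values_file" release-oncall grafana/oncall
}

MIRAGE_IV="$(generate_iv)"
echo "mirageCipherIV: ${MIRAGE_IV}"
```

Copy the printed line under the oncall key of the values file, then run install_oncall values.yaml against your cluster.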
Grafana configuration
Our Grafana instance needs the OnCall plugin installed. You can do this with an environment variable. If you haven’t already, I strongly recommend adding a PVC to your Grafana instance: the OnCall configuration requires a manual operation, and we want to do it only once!
- You can pin a specific version of the plugin.
- vv1.9.22 is expected. Don’t ask why.
GF_INSTALL_PLUGINS: "grafana-oncall-app vv1.9.22"
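If your Grafana is not deployed from a values file, one possible way to set this variable is kubectl set env. This is only a sketch: it assumes Grafana runs as a Deployment named my-grafana (a hypothetical name) in the monitoring namespace, and set_grafana_plugins is an illustrative helper, not an existing tool.

```shell
#!/bin/sh
# Plugin version string as given in the article (double "v" included).
ONCALL_PLUGIN_VERSION="vv1.9.22"
GF_INSTALL_PLUGINS="grafana-oncall-app ${ONCALL_PLUGIN_VERSION}"

# Apply the variable to a Grafana Deployment.
# $1 = deployment name (assumed: my-grafana), $2 = namespace (assumed: monitoring).
set_grafana_plugins() {
  kubectl set env "deployment/${1:-my-grafana}" -n "${2:-monitoring}" \
    "GF_INSTALL_PLUGINS=${GF_INSTALL_PLUGINS}"
}
```

Calling set_grafana_plugins triggers a rollout of the Deployment, so Grafana restarts and installs the plugin on startup.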
Starting from version 1.9.0, a manual action is required to configure OnCall.
- root and secret are the admin credentials of my Grafana instance. Change them to match yours.
- In the payload, onCallApiUrl and grafanaUrl should be edited to match your configuration. They are the in-cluster DNS addresses of the services.
# With port forward
curl -X POST 'http://root:secret@localhost:3000/api/plugins/grafana-oncall-app/settings' -H "Content-Type: application/json" -d '{"enabled":true, "jsonData":{"stackId":5, "orgId":100, "onCallApiUrl":"http://grafana-oncall-on-call-engine.monitoring:8080", "grafanaUrl":"http://my-grafana-service.monitoring:3000"}}'
curl -X POST 'http://root:secret@localhost:3000/api/plugins/grafana-oncall-app/resources/plugin/install'
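These two calls can be parameterised so the credentials and URLs live in one place. This is a sketch under assumptions: the helper names oncall_settings_payload and configure_oncall_plugin are made up for illustration, the port-forward targets the my-grafana-service service in the monitoring namespace used in the values file, and stackId and orgId are copied unchanged from the commands above.

```shell
#!/bin/sh
# Build the JSON payload for the plugin settings endpoint.
# $1 = in-cluster OnCall engine URL, $2 = in-cluster Grafana URL.
oncall_settings_payload() {
  printf '{"enabled":true, "jsonData":{"stackId":5, "orgId":100, "onCallApiUrl":"%s", "grafanaUrl":"%s"}}' "$1" "$2"
}

# Post the settings, then trigger the plugin install.
# $1 = Grafana base URL including credentials, $2 = engine URL, $3 = Grafana URL.
configure_oncall_plugin() {
  curl -fsS -X POST "$1/api/plugins/grafana-oncall-app/settings" \
    -H "Content-Type: application/json" \
    -d "$(oncall_settings_payload "$2" "$3")" &&
  curl -fsS -X POST "$1/api/plugins/grafana-oncall-app/resources/plugin/install"
}

# Example usage (needs access to the cluster):
#   kubectl port-forward -n monitoring svc/my-grafana-service 3000:3000 &
#   configure_oncall_plugin 'http://root:secret@localhost:3000' \
#     'http://grafana-oncall-on-call-engine.monitoring:8080' \
#     'http://my-grafana-service.monitoring:3000'
```

The -f flag makes curl return a non-zero exit code on an HTTP error, so the install call only runs when the settings call succeeded.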