Hello OpsGenie fellows,
I would like to get your opinion on a specific issue we have with our alerting right now:
We use Prometheus in our production environment for alerting. We have generic alerting rules (like HAProxy backend down per microservice, HTTP health check failed per microservice, etc.). When we create a new microservice in production, the service is integrated into Prometheus automatically. All our alerting rules therefore apply to the new microservice as soon as it is fully prepared (meaning: all servers are set up and online). In some cases a newly prepared microservice has not been deployed yet, so the service is down and Prometheus fires alerts (HTTP check failed, HAProxy backends down).
A second case: we do maintenance work on the database of a specific microservice. For that we want to create a silence for all alerts that match, let’s say, the service tag. Assuming our microservice is called payment, we want to silence all alerts that match the service tag “payment”.
Since we use OpsGenie for all our alerts (not only on-call relevant alerts), we would need to create silences for alerts that match a specific pattern (like the service tag) until a specific end date. We know about the “snooze” feature, but it does not work well for us: a snooze lasts at most 1 week, and some of our snoozes need to be in place for longer than that. We are also not able to create a snooze for multiple alerts with one command.
I know that we can achieve this by creating silences directly in Prometheus. But since we are heavily using the OpsGenie Slackbot, it would be nice to do it in OpsGenie directly.
Is there any way to achieve this with OpsGenie? Otherwise I would need to think about implementing it directly in the Alertmanager of Prometheus.
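For reference, this is roughly what the Alertmanager fallback would look like. It is only a minimal sketch: it assumes the alerts carry a label named `service`, that Alertmanager is reachable at a base URL such as `http://alertmanager:9093` (a placeholder, not our real host), and that the v2 HTTP API is enabled. The function names `build_silence` and `post_silence` are made up for illustration.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def build_silence(service, until, author, comment, now=None):
    """Build an Alertmanager v2 silence payload for one service label value.

    Assumption: alerts are labelled with `service=<name>`.
    """
    now = now or datetime.now(timezone.utc)
    if until <= now:
        raise ValueError("silence end must be in the future")
    return {
        "matchers": [
            # Exact (non-regex) match on the assumed `service` label.
            {"name": "service", "value": service, "isRegex": False, "isEqual": True}
        ],
        "startsAt": now.isoformat(),  # timezone-aware, so RFC 3339 compatible
        "endsAt": until.isoformat(),
        "createdBy": author,
        "comment": comment,
    }


def post_silence(base_url, silence):
    """POST the payload to Alertmanager and return the new silence ID."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/v2/silences",
        data=json.dumps(silence).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["silenceID"]


if __name__ == "__main__":
    # Silence everything tagged service=payment for 3 weeks (longer than a snooze allows).
    end = datetime.now(timezone.utc) + timedelta(weeks=3)
    payload = build_silence("payment", end, "ops-team", "database maintenance")
    print(json.dumps(payload, indent=2))
    # post_silence("http://alertmanager:9093", payload)  # placeholder URL
```

One silence like this covers every alert with the matching label until the end date, which is exactly the behaviour we are missing on the OpsGenie side.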