Silence specific alerts for a longer time period

#1

Hello OpsGenie fellows,

I would like to get your opinion on a specific issue we have with our alerting right now:

Case 1:
We use Prometheus in our production environment for alerting. We have generic alerting rules (like HAProxy backend down per microservice, HTTP health check failed per microservice, etc.). When we create a new microservice in production, the service is integrated into Prometheus automatically. All our alerting rules therefore apply to the new microservice as soon as it is fully prepared (meaning: all servers are set up and online). In some cases a newly prepared microservice has not been deployed yet, so the service is down and Prometheus fires alerts (HTTP check failed, HAProxy backends down).

Case 2:
We do maintenance work on the database of a specific microservice, so we want to create a silence for every alert that matches, say, the service tag. Assuming our microservice is called payment, we want to silence all alerts that match the service tag “payment”.

Since we use OpsGenie for all our alerts (not only the ones relevant to on-call), we would need to create silences for alerts that match a specific pattern (like the service tag) until a specific end date. We know about the “snooze” feature, but a snooze lasts at most 1 week, and some of our silences need to stay in place longer than that. We also cannot snooze multiple alerts with a single command.

I do know that we can achieve this by creating silences directly in Prometheus. Since we rely heavily on the OpsGenie Slackbot, it would be nice to set this up in OpsGenie directly.

Is there any way to achieve this with OpsGenie? Otherwise I would need to think about implementing it directly in the Alertmanager of Prometheus.
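For reference, the Alertmanager fallback could look roughly like the sketch below. It only builds the JSON body for a silence, assuming the Alertmanager v2 API (`POST /api/v2/silences`) and assuming the alerts carry a label named `service`. The function name `build_silence` and the example values are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def build_silence(service, days, created_by, comment, now=None):
    """Build the JSON body for an Alertmanager v2 silence.

    Assumption: alerts carry a label called "service"; adjust the
    matcher name to whatever label your rules actually set.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    start = now or datetime.now(timezone.utc)
    end = start + timedelta(days=days)
    return {
        # One exact-match matcher: service == <service>
        "matchers": [
            {"name": "service", "value": service, "isRegex": False, "isEqual": True}
        ],
        # Timezone-aware datetimes serialize to RFC 3339 timestamps
        "startsAt": start.isoformat(),
        "endsAt": end.isoformat(),
        "createdBy": created_by,
        "comment": comment,
    }


# Example: silence everything for "payment" for two weeks.
payload = build_silence("payment", 14, "niko", "database maintenance")
```

The resulting dict would be sent as JSON to the Alertmanager host. Unlike a snooze, the end date can be further out than one week, and the single matcher covers every alert with that label.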


#2

Hi @niko!

It sounds like you’re interested in leveraging maintenance within Opsgenie. With maintenance, you can set up a time frame during which a certain policy is enabled or a certain integration is disabled.

This gives you a couple of options. One is to create a notification policy that suppresses or delays notifications for all alerts that meet certain conditions (e.g. tags contain payment), and have the maintenance enable that policy.

Alternatively, you could choose to have the maintenance disable the integration that these alerts would be coming from.

I think either approach would be a good solution for your use cases: using maintenance to schedule a notification policy that suppresses or delays notifications for certain alerts, or using maintenance to disable the integration those alerts come from. Let me know if you have any more questions!
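As a rough illustration of the first option, here is a sketch that builds the request body for a scheduled maintenance window that enables one policy. It assumes the Opsgenie Maintenance API (`POST /v1/maintenance`) with the field names shown; please verify them against the current API documentation before relying on this. The function name and the policy ID are hypothetical placeholders.

```python
from datetime import datetime, timedelta, timezone


def build_maintenance(policy_id, start, end, description):
    """Build a request body for a scheduled maintenance window.

    Assumption: the field names follow the Opsgenie Maintenance API
    ("time", "rules", "entity"); check the docs for your account.
    """
    if end <= start:
        raise ValueError("end must be after start")

    def fmt(ts):
        # Normalize to UTC and format as e.g. 2024-01-01T08:00:00Z
        return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    return {
        "description": description,
        "time": {"type": "schedule", "startDate": fmt(start), "endDate": fmt(end)},
        # Enable the suppressing notification policy for the whole window
        "rules": [{"state": "enabled", "entity": {"id": policy_id, "type": "policy"}}],
    }


# Example: a 10-day window, longer than the 1-week snooze limit.
start = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
body = build_maintenance(
    "policy-id-placeholder", start, start + timedelta(days=10), "payment DB maintenance"
)
```

The policy itself (for example, matching tags that contain payment) would be created once and stay disabled outside the window.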

Thanks,
Samir


#3

Hi @Samir,

Yes, you are right. The notification policy sounds promising. I will take a closer look at that.
Thanks for the hint!
