Alerts Page Defaults


#1

I’ve just run into a possible issue caused by a combination of Alerts page defaults (or at least my saved Alerts page filters) and alert de-duplication.

Namely, we had an alert fire on December 31st (which no one paid any attention to, partly because we’re still ramping up our usage of Opsgenie) and again earlier today. I noticed the alert had fired today because our Alertmanager routing (we use Prometheus) also sends emails whenever an alert fires. So I went to Opsgenie but couldn’t see any trace of the alert. A while later I figured out why: today’s firing was de-duplicated into the alert created on December 31st, and the default time range filter on the Alerts page is “created during last week” (at least that’s how it is for me; I might have touched that in the past). Since that alert was created more than 7 days ago, there was nothing for me to see there.
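To make the interaction concrete, here is a minimal Python sketch of how a "created during last week" filter can hide an alert that fired again today. It is a toy model, not Opsgenie's implementation: the `Alert` class, its field names, the `visible_alerts` helper, the alert name and the year in the dates are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Alert:
    alias: str            # de-duplication key (hypothetical field name)
    created_at: datetime  # set once, when the alert is first created
    count: int = 1        # number of firings folded into this alert


def visible_alerts(alerts, now, window=None):
    """Return the alerts whose creation time falls inside the window.

    window=None models an "all time" filter.
    """
    if window is None:
        return list(alerts)
    return [a for a in alerts if now - a.created_at <= window]


# Hypothetical dates: the alert is created on December 31st and
# fires again more than 7 days later.
now = datetime(2020, 1, 9, 12, 0)
alert = Alert(alias="DiskFull", created_at=datetime(2019, 12, 31, 8, 0))
alert.count += 1  # today's firing is de-duplicated; created_at is unchanged

last_week = visible_alerts([alert], now, window=timedelta(days=7))
all_time = visible_alerts([alert], now)
```

In this model `last_week` is empty while `all_time` still contains the alert with a count of 2, which matches what I saw on the Alerts page.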

Would it be possible for the default time range on the Alerts page to be “all time”? And I don’t mean just for me, but for all users (at least within my company). It doesn’t look like it would put too much load on Opsgenie, as only the first couple of screens’ worth of alerts are loaded (with the rest loaded when scrolling to the bottom).

Ideally the default time range could be set globally by an admin and user changes to the time range would not be persisted at all, so users would not be able to shoot themselves in the foot (persisting time ranges or not could also be an option to be selected by the admin).

Failing that (and considering that such a change could not be made within days anyway), do you have any suggestions for how to work around this issue? All I could come up with was to edit all escalations and add an extra notification after 7 days. But that’s both too specific (a user could set their default time range to last day instead of last week) and not visible enough (the 7-day notification may slip through the cracks, and it’s a one-time thing).

Thanks,
Alin.


#2

Hi Alin,

Thanks for the feedback. The default “Last Week” range is indeed set on purpose due to certain performance issues, but I’ll pass your remarks on to the product team.

However, I would like to mention that once you change it, your preference will be saved and it should automatically show “All Time” the next time you log in or use the saved search. Do you think that could work?

As for deduplication: make sure your Alertmanager is closing out alerts automatically in Opsgenie when they recover. This way, the new incoming data would have raised a new alert instead of being deduplicated into the original one.
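As a rough illustration of the point above, the following Python sketch models de-duplication keyed on an alert alias, where only open alerts absorb new firings. It is a simplified model under stated assumptions, not Opsgenie's actual code: the `AlertStore` class, its methods and the alert name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    alias: str            # de-duplication key (hypothetical field name)
    status: str = "open"  # "open" or "closed"
    count: int = 1        # number of firings folded into this alert


class AlertStore:
    """Toy alert store: a firing is merged into an open alert with the same alias."""

    def __init__(self):
        self.alerts = []

    def _open_alert(self, alias):
        for alert in self.alerts:
            if alert.alias == alias and alert.status == "open":
                return alert
        return None

    def fire(self, alias):
        existing = self._open_alert(alias)
        if existing is not None:
            existing.count += 1  # de-duplicated into the open alert
            return existing
        new_alert = Alert(alias)  # no open alert with this alias: create one
        self.alerts.append(new_alert)
        return new_alert

    def resolve(self, alias):
        existing = self._open_alert(alias)
        if existing is not None:
            existing.status = "closed"


# Without auto-close: the second firing lands in the original alert.
no_close = AlertStore()
no_close.fire("DiskFull")
no_close.fire("DiskFull")

# With auto-close on recovery: the second firing creates a fresh alert.
with_close = AlertStore()
with_close.fire("DiskFull")
with_close.resolve("DiskFull")
with_close.fire("DiskFull")
```

In this model `no_close` ends up with a single alert whose count is 2, while `with_close` holds two separate alerts, the second of which would carry today's creation time.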

Hope this helps, and apologies for the inconvenience you’ve experienced. I’m positive that with the right setup, you can minimize the risk of such a scenario happening again!


#3

Hi Daniel,

And thanks a lot for the reply. Also happy to hear that you’re passing along the feedback.

Regarding the fact that changes to the Alerts page filters are persisted and used on the next visit, I had noticed that was happening. I do see a problem with this behavior too (I can find issues with anything, apparently): if I ever play with the range for some one-off search, my “all time” preference will be overwritten and I will likely overlook that detail until I run into an issue with a “missing” alert again.

As for Alertmanager closing alerts automatically, we have made a conscious decision to disable that behavior. An alert should (ideally) only fire if something is broken. Even if the alert stops firing, it doesn’t mean that the underlying problem has magically fixed itself (if it’s that kind of problem then it should probably not alert in the first place). So our policy is that if an alert fires, someone needs to take a look at it, figure out why it happened and take some action (whether fixing the underlying issue or adjusting the alert threshold).

That being said, we’ll make do with (1) keeping an eye on the Alerts page range and (2) being more consistent in addressing alerts long before they’ve been firing for a week. So it’s not a critical issue for us, but a fix would definitely help.

Cheers,
Alin.


#4

@free you can also integrate Opsgenie with a collaboration tool like Teams or Slack and pipe the alerts into a dedicated channel. Then you’ll never miss them again.