Safe monitoring with Zabbix
February 20, 2017
New features in Zabbix 3.2: "Event Tags"
In these first posts, we are going to look at some of the interesting new features in the new version of Zabbix. For this first post we have chosen one of the most notable: "Event tags".
Until now, Zabbix's approach to incident management has been oriented to specific "hosts", that is, to servers and the services running on them, rather than to "services" as a whole. This perspective comes from the original nature of Zabbix configuration (which still applies) and works very well for managing networks and servers in a single location, which is a very common use of the system.
But today, with the increasingly widespread use of containers and, especially, micro-service architectures, a different approach is needed to make Zabbix a suitable tool both for the teams that manage the infrastructure and for those that manage the services (which, increasingly, are the same teams). To this end, the Zabbix team has redesigned the problem screen and added the ability to assign specific tags to each trigger.
These tags consist of a tag name, which can have an associated value, giving us an extra level of classification. For example: WebServices, WebServices:Muutech-Login, WebServices:Muutech-Web, and others we may think of once we start working with them, such as who is responsible or whether customers are affected (Team:Ops, Team:Dev, Customer:External).
Example of use and configuration of triggers
The clearest use case is a single host whose services affect several of our applications or micro-services. We could, for example, monitor specific data from each micro-service's database, the disk usage of its assigned folders, and so on. In this example, however, we are going to use the place where output from different micro-services, or even other systems, often ends up mixed together in a way that is hard to follow: the log. We will use the log of an Apache server that serves several web micro-services (two in particular, which we will call Muutech-Login and Muutech-Web). Although it would be advisable to write each micro-service's log to a separate file, we will have them write to the general log in order to show the full benefit of these Zabbix features.
For the example, we want Zabbix to warn us every time there is a 404 error, such as:
192.168.56.1 - - [29/Jan/2017:20:46:57 +0100] "GET /muutech-login/no-existe HTTP/1.1" 404 221 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36"
We configure the item appropriately to read the GET requests from the log, and set up the Zabbix agent to support active mode and, importantly, to have read permission on the log. When configuring the trigger, the place to indicate which tag to attach when this problem occurs is the "Tags" section. The following macros can be used to fill in the tags: {ITEM.VALUE} and {ITEM.LASTVALUE}, as well as user macros. This can be tremendously useful for auto-generating tags, in combination with the new regsub and iregsub macro functions. The Zabbix manual has examples of how to configure them.
Notice what our trigger does: it fires if the last GET received (read from the log) contains a 404, so a problem is generated every time a 404 appears and is cleared every time there is a successful GET.
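To make this behaviour concrete, here is a minimal Python sketch that walks through a few access-log lines and reports whether the trigger would be in a problem state after each GET. It only imitates the logic described above and is not how Zabbix evaluates triggers internally. The helper name problem_states, the regular expression and the two 200 lines are assumptions invented for the illustration; the 404 line is a shortened version of the one shown earlier.

```python
import re

# Assumed helper pattern: captures the HTTP status code that follows the
# quoted GET request in an Apache access-log line.
STATUS_RE = re.compile(r'"GET [^"]*" (\d{3}) ')

def problem_states(log_lines):
    """For each GET line, tell whether the trigger would be in PROBLEM state."""
    states = []
    for line in log_lines:
        match = STATUS_RE.search(line)
        if match is None:
            continue  # not a GET request, so the item would not collect it
        # Only the last received GET matters: 404 raises, anything else clears.
        states.append(match.group(1) == "404")
    return states

# Sample lines: the first and third are made up for this sketch.
lines = [
    '192.168.56.1 - - [29/Jan/2017:20:46:50 +0100] "GET /muutech-web/ HTTP/1.1" 200 512 "-" "-"',
    '192.168.56.1 - - [29/Jan/2017:20:46:57 +0100] "GET /muutech-login/no-existe HTTP/1.1" 404 221 "-" "-"',
    '192.168.56.1 - - [29/Jan/2017:20:47:03 +0100] "GET /muutech-login/ HTTP/1.1" 200 300 "-" "-"',
]
print(problem_states(lines))  # [False, True, False]
```

The problem appears with the 404 and disappears again with the next successful GET, even though that GET may belong to a different micro-service, which is one more reason to tag the problem with the folder it came from.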
Look at the tag we have set: the generic name "WebServices" and, as its value:
{{ITEM.VALUE}.regsub("GET \/([A-Za-z\-]+)\/", \1)}
This takes the folder from the request (our rudimentary equivalent of the micro-service's VirtualHost). With this, every time the 404 alarm is triggered, the tag tells us which micro-service was affected.
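If you want to check the regular expression before putting it into the trigger, a quick sketch in Python can help. It applies the same pattern to the log line from our example and returns the first capture group, which is what \1 refers to in the macro. The function name tag_value is our own, and returning an empty string when nothing matches is a choice made for this sketch, not a statement about what Zabbix does in that case.

```python
import re

# Same pattern as in the tag value; Python's re accepts the escaped slashes.
PATTERN = re.compile(r"GET \/([A-Za-z\-]+)\/")

def tag_value(item_value):
    """Return capture group 1 (the folder), or "" if the pattern does not match."""
    match = PATTERN.search(item_value)
    return match.group(1) if match else ""

log_line = (
    '192.168.56.1 - - [29/Jan/2017:20:46:57 +0100] '
    '"GET /muutech-login/no-existe HTTP/1.1" 404 221 "-" "Mozilla/5.0"'
)
print(tag_value(log_line))  # muutech-login
```

Note that the pattern only matches requests with at least one folder level, so a request for a file in the web root would leave the value empty in this sketch.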
Let's try it.
Problem display
In Monitoring -> Problems we can see the tags associated with each triggered problem, with a maximum of three tags shown. If there are more than three, hovering the mouse over the three dots that appear shows the rest.
For our example, we tried to access /muutech-login/no-existe and, as expected, an alarm was triggered, which we can see correctly tagged on the screen:
As you can see, the tags we use should be as concise as possible, since the space for displaying values is limited. However, we do have the great advantage of being able to filter by these tags when viewing problems, which allows us to build customized screens from them and monitor our micro-services properly... or whatever else we decide to tag based on our architecture.
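The idea of filtering by tag can be sketched in a few lines of Python. This is only a model of the concept: the Tag type, the helper names and the list of problems are all invented for the illustration, and the exact matching used here is a simplification that may differ from how the Zabbix filter compares tag names and values.

```python
from typing import NamedTuple

class Tag(NamedTuple):
    name: str
    value: str = ""  # the value is optional, as in a bare "WebServices" tag

def parse_tag(text):
    """Split "Name:Value" into a Tag; a bare "Name" gets an empty value."""
    name, _, value = text.partition(":")
    return Tag(name, value)

def matches(problem_tags, wanted):
    """True if any tag has the wanted name and, when one is given, the wanted value."""
    return any(
        tag.name == wanted.name and (wanted.value == "" or tag.value == wanted.value)
        for tag in problem_tags
    )

def filter_problems(problems, wanted_text):
    """Return the names of the problems carrying the wanted tag."""
    wanted = parse_tag(wanted_text)
    return [name for name, tags in problems.items() if matches(tags, wanted)]

# Made-up open problems and their tags.
problems = {
    "404 in Apache log (login)": [parse_tag("WebServices:muutech-login"), parse_tag("Team:Dev")],
    "404 in Apache log (web)": [parse_tag("WebServices:muutech-web"), parse_tag("Team:Ops")],
    "Disk almost full": [parse_tag("Team:Ops")],
}

print(filter_problems(problems, "WebServices"))
print(filter_problems(problems, "Team:Ops"))
```

Filtering by the bare name WebServices returns both 404 problems, while Team:Ops returns the second 404 problem and the disk problem, which is the kind of per-service or per-team view that tags make possible.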
Usage of tags in actions
When configuring actions (for example, notifications) we can use tags as filter conditions, which gives us finer granularity when running automated actions or deciding who to warn.
Besides, since in the notification message we can use the macro {EVENT.TAGS} (and, for recovery messages, {EVENT.RECOVERY.TAGS}), it is very useful to include the tags in our emails as well.
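As a rough illustration of both ideas, the following Python sketch chooses recipients from a problem's tags and fills a message template containing {EVENT.TAGS}. In Zabbix itself the action conditions and the macro expansion are handled by the server. Everything here is hypothetical: the routing table, the e-mail addresses, the function names and the name:value, comma-separated rendering of the tags are assumptions made for the example.

```python
# Hypothetical routing table: which address to warn for which tag.
ROUTES = {
    ("Team", "Ops"): "ops@example.com",
    ("Team", "Dev"): "dev@example.com",
}

def format_event_tags(tags):
    """Render (name, value) pairs as a comma-separated list (assumed format)."""
    return ", ".join(f"{name}:{value}" if value else name for name, value in tags)

def render(template, tags):
    """Replace the {EVENT.TAGS} placeholder in a message template."""
    return template.replace("{EVENT.TAGS}", format_event_tags(tags))

def recipients(tags):
    """Pick the addresses whose routing tag is present on the problem."""
    return sorted({ROUTES[tag] for tag in tags if tag in ROUTES})

tags = [("WebServices", "muutech-login"), ("Team", "Dev")]
print(recipients(tags))                        # ['dev@example.com']
print(render("Problem tags: {EVENT.TAGS}", tags))
# Problem tags: WebServices:muutech-login, Team:Dev
```

With tags such as Team:Ops or Team:Dev on the trigger, the same 404 problem can reach different people depending on the micro-service it belongs to.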
Advanced uses
After this short explanation of basic use, you can try incorporating these tags into your templates, or even use them in templates that rely on LLD (low-level discovery), where LLD's own macros can be used to fill in the tag values.
It is also worth mentioning that these tags can be used together with the event correlation system, which we will look at in future posts, as well as with problem closing (both new in version 3.2).
Impact on performance
Of course, this functionality is not free in terms of the performance of our Zabbix. As Zabbix's own documentation (and plain logic) warns, the following will be affected:
- Event processing in general
- Creation, updating and deletion of triggers, especially triggers inherited through templates and changes made to them
- Configuration cache synchronization
- Disk space consumed by each event, which grows in a way that is hard to predict, since it varies a lot depending on the size and number of tags.