Monitoring / Metrics with Zabbix
================================

We are using a Zabbix 5.0 (LTS) server with a PostgreSQL database. We started
with manual configuration in a test VM and then automated it for deployment.
The Ansible roles `zabbix-server` and `zabbix-agent` are the results of this
PoC work. Please follow the FAQ to see how to access the staging deployment of
Zabbix.

zabbix-server
-------------

This role is ready at a basic level, but as the complexity of the monitoring
increases, more work will be needed. At the current level, it

- installs the packages needed for the server
- configures the Zabbix, Apache and PostgreSQL configuration files
- configures the web UI
- configures Kerberos authentication

While these basics are good enough for a PoC, the role is not ready for
production until we have done the following

- add inventory files for groups and users, and have zabbix-cli restore those
  in case of a fresh installation
- audit the network configuration (see common challenges)

zabbix-agent
------------

This role is ready to be used, and the existing templates are good enough to
gather basic information. However, the specifics of which common data should
be collected from all agent nodes still need to be discussed widely and set in
a template.

Other than common metrics, one can also export custom metrics using
zabbix-sender (see FAQ).

Common challenges
-----------------

We lack experience with SELinux policies and network configuration, so we are
not very confident in those parts of the setup. A veteran sysadmin would be
needed to audit them.
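
As an illustration of the custom metrics mentioned under zabbix-agent, the following is a minimal sketch of pushing a single value with the ``zabbix_sender`` command-line tool from Python. It is not part of the roles described here. The server name, host name and item key are hypothetical placeholders, and the sketch assumes that an item of type "Zabbix trapper" with that key already exists on the server and that the trapper listens on the default port 10051.

```python
import shlex
import subprocess


def build_sender_command(server, host, key, value, port=10051):
    """Build the argv for one zabbix_sender call.

    -z: Zabbix server to send to
    -p: trapper port on the server (10051 is the default)
    -s: host name exactly as registered in the Zabbix frontend
    -k: item key (the item must be of type "Zabbix trapper")
    -o: value to send
    """
    return [
        "zabbix_sender",
        "-z", server,
        "-p", str(port),
        "-s", host,
        "-k", key,
        "-o", str(value),
    ]


def send_metric(server, host, key, value, port=10051):
    """Run zabbix_sender and return the CompletedProcess for inspection."""
    cmd = build_sender_command(server, host, key, value, port)
    # check=False: let the caller decide how to handle a non-zero exit code.
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Hypothetical names, used only to show the resulting command line.
    cmd = build_sender_command(
        "zabbix.example.org", "node01.example.org", "custom.queue.length", 42
    )
    print(shlex.join(cmd))
```

Running the module only prints the command line it would execute. Calling ``send_metric`` requires ``zabbix_sender`` to be installed on the node and the server to be reachable from it.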