Linux Management and Monitoring: Zabbix + Puppet ?!?

I’ve spent a good part of this week evaluating various solutions for monitoring and managing Linux servers.  There are many solutions that meet some of the needs, but I have not found a single product that does everything I need.  Without going into significant detail, here are my needs:

  • Simple service up/down monitoring and alerting
  • Detailed metrics for disk, CPU, memory, network, etc
  • Storage and graphing of historical data
  • Flexible creation of custom monitors, triggers, etc.
  • Significant pre-configured monitors, triggers, etc.
  • Deployment/upgrade/removal of packages
  • Configuration management
  • Inventory of hardware/software
  • Rich access control to satisfy various distributed administration and audit requirements
  • *ALL* functionality exposed by web interface
  • Inexpensive, preferably open source
  • Able to scale to several hundred, maybe thousands of Linux servers – and possibly even Windows servers.
  • Should be developed in a language I know so I can modify myself – (C, Perl, Python, Java)
  • Should run on Debian, though it does not need to be in the repo.

Candidate solutions:

  • Nagios
  • OCS-Inventory
  • Hyperic
  • Puppet
  • Zabbix

Conclusion:

Nothing definitive yet, but I’m narrowing in on a combination of Zabbix and Puppet.  I am still looking for something to automate the collection of inventory data, but I think this could be done using the Zabbix API to populate the Zabbix inventory (which is otherwise a manual process).

LazyWeb – any opinions?