I’ve been setting up and testing Prometheus and Grafana for about a week now, since that seems to be the universally accepted solution for self-hosted monitoring. But I’m starting to question why it is so accepted. On top of Prometheus not seeming useful on its own (needing Grafana to visualize and Alertmanager for alerts), it feels like with each thing I want to monitor I have to spin up another Docker container to export/gather the data. There are other options like LibreNMS that seem to have all that built into one container. So what does this Prometheus/Grafana stack have that other monitoring services don’t? Is it really worth having to set up each of these specialized exporters and dashboards? Or am I mistaken that it’s the main solution everyone uses? Are you using something different for monitoring?
One of the reasons you see these as separate is the modularity you get with Grafana: you don’t always use it with Prometheus, sometimes it’s ELK, sometimes Influx and Telegraf. If you intend to set it up outside of the “typical”, you start to really appreciate one piece doing one thing, and doing it well.
I use Prometheus and Grafana at home because I use them at work, so I’m familiar with them.
'Cos the bros don’t deem Icinga cool enough.
Standardization is amazing: one data source and one graphing engine. Now you can overlay different metrics from different systems and have very customized dashboards.
Grafana and Prometheus are great if you have numeric things you want to monitor. CPU usage, RAM, disks, throughput, etc. You can then do lots of things with these numbers, mainly compare them to your other systems or alert when they go out of bounds.
However, I very much prefer Zabbix for my home network monitoring, as it is not so fixated on numbers and can easily work with, e.g., error messages in logfiles and alert on those. Or I can regularly check a website for new firmware versions and alert once the latest version changes. There are also lots of ready-to-use templates available from their Community Hub.
prometheus and grafana … seems to be the universally accepted solution for self-hosted monitoring
Not exactly. There are many ways to do this. Most of us just use this solution because it’s easily scalable, well documented, and what we are probably already doing at work.
all built into one container
It’s nice to separate data sources from the dashboards and alerting platforms. It’s scalable, extremely lightweight, and gives you more options.
On top of prometheus not seeming useful on its own …
Yeah, that’s just not true. Maybe for you, in your use case. Installing a Prometheus node exporter gives you an easily accessible HTTP endpoint with plain-text metrics (the Prometheus exposition format, not JSON) that can be used however you like. Modularity is a good thing. Being able to swap parts in and out with other parts is a good thing.
If you haven’t figured it out yet, there is no single correct answer here; use what fits your needs. While I have a dashboard set up in Grafana, it’s not my main use case. Since the data is available from the node exporters on all my hardware, I wrote my own alerting scripts and automations in Python.
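To give an idea of what such a script can look like, here is a rough sketch, not my actual setup. It assumes a node exporter on its default port 9100 and uses the standard `node_filesystem_avail_bytes` / `node_filesystem_size_bytes` metrics; the function names, the URL and the 10% threshold are made up for illustration, and the parser only handles the simple `name{labels} value` lines of the text format.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text-format lines into a {series: value} dict."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            end = line.rfind("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        samples[series] = float(rest[0])  # an optional timestamp is ignored
    return samples


def fetch_metrics(url="http://localhost:9100/metrics", timeout=5):
    """Scrape one exporter; the URL is an assumption (node exporter default)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


def low_disk_alerts(samples, min_free_ratio=0.10):
    """Return the filesystem series whose free space is below the threshold."""
    alerts = []
    for series, avail in samples.items():
        if not series.startswith("node_filesystem_avail_bytes{"):
            continue
        size_series = series.replace(
            "node_filesystem_avail_bytes", "node_filesystem_size_bytes", 1
        )
        size = samples.get(size_series)
        if size and avail / size < min_free_ratio:
            alerts.append(series)
    return alerts


if __name__ == "__main__":
    for series in low_disk_alerts(fetch_metrics()):
        print("LOW DISK:", series)  # swap print for mail, a webhook, etc.
```

Run it from cron or a systemd timer and you have basic alerting without Grafana or Alertmanager in the picture.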
That’s the beauty of self hosting.
I use InfluxDB plus Grafana
I was asking myself the same. As everyone talks about these, I used them until I discovered Checkmk, and others. Now I’m no longer using Grafana and Prometheus…
Separate components that each do one thing, and do it well, are good. Extra containers are basically free.
- The exporters provide the metrics. They can be standalone executables like the node exporter, or be built into apps themselves easily since it’s just HTTP. It’s trivial to add metrics to just about anything without needing extra ports. Its protocol is also easier and more efficient than SNMP.
- Prometheus scrapes those metrics and stores them in its database. In other apps that’d be the role things like PostgreSQL have: you don’t really use it directly, but it’s no less important.
- Grafana is the frontend you slap in front of Prometheus to actually display your metrics.
- Alertmanager takes the alerts that Prometheus fires from its alerting rules and handles grouping, routing and sending the notifications. It’s separate because if your Prometheus box goes down, how are you gonna be alerted of that?
All 4 of those can be swapped with something else equivalent and it all still works. Don’t like the UI? Replace Grafana. Don’t like Prometheus? There’s VictoriaMetrics and InfluxDB.
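To show how little is involved in the "it’s just HTTP" part, here is a minimal sketch of an exporter using only the Python standard library. The metric name `demo_uptime_seconds`, port 9200 and all function names are invented for the example; for anything real you’d more likely reach for the official `prometheus_client` package rather than hand-rolling the text format.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START = time.monotonic()


def collect():
    """Gather current values; the metric here is a made-up example."""
    return {
        "demo_uptime_seconds": (
            "Seconds since this exporter started.",
            round(time.monotonic() - START, 3),
        ),
    }


def render_metrics(gauges):
    """Render {name: (help, value)} as Prometheus text-format gauges."""
    lines = []
    for name, (help_text, value) in sorted(gauges.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header(
            "Content-Type", "text/plain; version=0.0.4; charset=utf-8"
        )
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Port 9200 is arbitrary; point a Prometheus scrape job at it.
    HTTPServer(("127.0.0.1", 9200), MetricsHandler).serve_forever()
```

Anything that can answer a GET on `/metrics` with lines like these can be scraped, which is why so many apps ship an endpoint built in.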
It looks silly on a small scale, but it scales up very well. Couple hundred VMs per Prometheus install, node exporters on every VM and a single Grafana cluster to visualize the data for the whole infrastructure at once.
That makes it all well liked in enterprise which means there are exporters for damn near anything (even the Lemmy server has a built-in exporter I can scrape with Prometheus), which in turn makes it the easy solution for self-hosters too, and here we are.
I feel like it’s easier to set up than some of the all-in-one solutions I’ve used previously, despite being several components. They’re all components that basically just work out of the box.
I’ve been using Zabbix for years now. Does what I need it to do.