I’m hoping to find something that:

  • has a nice dashboard
  • is quick and simple to install
  • is very lightweight and unobtrusive
  • can send alerts via http request
  • Netdata is exactly what you’re looking for. It’s basically an all-in-one monitoring and alerting suite that collects and analyzes data and provides a gorgeous web dashboard for you to view.

    You can also manually replicate this using Prometheus, Grafana and other tools, but that requires a much bigger effort to set up.

    • ikidd@lemmy.world
      9 days ago

      I think they went to 5 nodes max on the free version as of the last patch. That’s damn near useless.

      • ipkpjersi@lemmy.ml
        9 days ago

        Is that just for the centralized dashboard portion? I tend to use each instance of it standalone, and primarily for the email alerts.

        • ikidd@lemmy.world
          9 days ago

          I believe so. I imagine the next stage of the enshittification will be to force those standalones to register with a portal account.

  • ocean@lemmy.selfhostcat.com
    10 days ago

    I just see if it works when I need it. If I’m at home it works. If I’m at work it may work. If I’ve left to travel it’s 95% definitely down and cannot be fixed. This works well!

  • loganb@lemmy.world
    10 days ago

    I personally use CheckMK.

    • Offers a free “Raw” version.
    • Can be deployed with Docker.
    • OSS

    One caveat: it can be a lot to take in at first, and it took me a while to get used to it.

    • corsicanguppy@lemmy.ca
      9 days ago

      CheckMK user here, via OMD.

      I’m looking for something else after the upgrade.

      1. Black interface isn’t pretty for me and the old interface was “meh too hard so we ditched it”.

      2. One half of the project split has a shit supply chain and just doesn’t meet the bar for upgrade requirements.

      3. The other half of the project split is a mess to config in an automated desired-state setup. It’s all edge-triggered manual bullshit. NO. ENOUGH.

      I miss 1.2.

  • Phoenixz@lemmy.ca
    10 days ago

    We just recently started using Zabbix. It’s open source and has a web interface that gives a central view, accessible from wherever we allow it.

    So far it’s been great, but we have had little time and so far have used only 1% of what it can do.

    Still, I’d recommend it. It’s super easy to install, seems lightweight, has clients for any OS you’d need, and can send out alerts (we currently use Pushover for that).

  • ddh@lemmy.sdf.org
    9 days ago

    I use my family. It has a simple volume based alert for when services are offline.

    • fmstrat@lemmy.nowsci.com
      6 days ago

      Until the UPS battery gets low and it beeps, and they look for a way to turn it off vs calling you. Yup.

    • vfsh@lemmy.blahaj.zone
      9 days ago

      It even automatically configures variable alert volumes corresponding to the importance of the service!

  • utopiah@lemmy.ml
    9 days ago

    “send alerts via http request”

    On this specifically, you might want to check out ntfy. It’s quite easy to set up and can give you notifications on pretty much any device (including iOS) via your own infrastructure, all the way down to basics such as SSE. That means you can subscribe to a topic (e.g. servers per physical location, alert level, etc.) and only get the notifications you need.
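
For anyone wondering what "alerts via HTTP request" looks like with ntfy, here is a minimal sketch rather than a definitive setup. It relies on ntfy accepting a plain HTTP POST to a topic URL, with the message as the request body and optional `Title` and `Priority` headers. The server URL `https://ntfy.example.com`, the topic name `servers-basement`, and both function names are made up for illustration.

```python
import urllib.request


def build_ntfy_request(base_url, topic, message, title=None, priority=None):
    """Build (but do not send) an HTTP POST that publishes `message` to a ntfy topic."""
    headers = {}
    if title:
        headers["Title"] = title        # shown as the notification title
    if priority:
        headers["Priority"] = priority  # e.g. "low", "default", "high", "urgent"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{topic}",
        data=message.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send_alert(request, timeout=10):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical self-hosted server and topic; replace with your own.
    req = build_ntfy_request(
        "https://ntfy.example.com",
        "servers-basement",
        "Disk usage above 90% on nas01",
        title="Disk alert",
        priority="high",
    )
    print(req.get_method(), req.full_url)
    # send_alert(req)  # uncomment once the URL points at a real ntfy server
```

Using one topic per location or alert level, as described above, just means changing the `topic` argument; each device subscribes only to the topics it cares about.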

    • utopiah@lemmy.ml
      9 days ago

      Node exporter, Prometheus and Grafana

      Otherwise, that stack is much heavier, but it’s also what I use.

  • tath@social.tath.link
    10 days ago

    Zabbix is pretty quick and easy. It has many different services built in for sending notifications, and you can add your own custom ones (including webhooks). The dashboard is fully customizable as well, so you can add whatever you want or need at a glance.

  • sgh@lemmy.ml
    10 days ago

    I use LibreNMS, as it uses SNMP for monitoring (which is available pretty much everywhere). I don’t believe it has HTTP alerts, but I know for a fact that it can send Telegram messages.

  • RegalPotoo@lemmy.world
    10 days ago
    • Base ansible role installs Prometheus node exporter, configured with the text file collector
    • VM automations push DNS records so that the Prometheus dns-sd automatically discovers them
    • Ansible roles add cron jobs that generate metrics for specific systems and dump them for the text file collector
    • Grafana for dashboards
    • Karma as a UI in front of Prometheus Alertmanager

    • tetris11@lemmy.ml
      10 days ago

      “Cron jobs that generate metrics for specific systems and dump them for the text file collector”

      Details please

      • RegalPotoo@lemmy.world
        10 days ago
        • https://github.com/prometheus/node_exporter?tab=readme-ov-file#textfile-collector - this makes node exporter watch a specific directory for files that contain metrics, then re-export them to the central Prometheus server alongside its own metrics
        • Some systems have their own metrics endpoints - instead of getting Prometheus to scrape these directly, I set up a cron job to curl them into files for node exporter - this means I don’t need extra config in Prometheus to find the endpoints, and don’t need to mess with firewall rules
        • Other systems don’t directly expose metrics in a format Prometheus can use - in this case I write or find a script that can do the conversion, then either set it up to write the metrics file directly and run it from cron, or run it as a service with another cron job to do the scrape
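
To make the "write the metrics file directly" case concrete, here is a rough sketch of what such a script could look like. It is not the setup described above, only an illustration under a few assumptions: the metric name, the file name `backup.prom`, and both function names are invented. It renders gauges in the Prometheus text exposition format and writes them through a temporary file plus an atomic rename, so node exporter never reads a half-written file. The textfile collector only picks up files ending in `.prom`, which is why the temporary file uses a different suffix.

```python
import os
import tempfile


def render_metrics(metrics):
    """Render {name: (help_text, value)} as Prometheus text exposition format."""
    lines = []
    for name, (help_text, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def write_prom_file(directory, filename, metrics):
    """Write metrics to <directory>/<filename> via a temp file and an atomic rename."""
    target = os.path.join(directory, filename)
    # The temp file lives in the same directory so os.replace stays atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(render_metrics(metrics))
        os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; node exporter must be able to read it
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return target


if __name__ == "__main__":
    # Hypothetical metric; a real script would compute the value from the system being monitored.
    demo_dir = tempfile.mkdtemp()
    path = write_prom_file(
        demo_dir,
        "backup.prom",
        {"backup_last_success_timestamp_seconds": ("Unix time of the last successful backup.", 1700000000)},
    )
    print(open(path, encoding="utf-8").read(), end="")
```

In a real deployment the directory would be whatever node exporter was started with via `--collector.textfile.directory`, and a cron entry would run the script on a schedule.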