Service monitor, test OK if no ping

Hello community,

I'm looking at how to test a machine that is supposed to stay offline, based on the ping monitor.

I tried to set this up, with no success. I changed the critical options for both settings, and I also tried applying the settings one at a time, but got the same result.

I don't understand, because the result clearly matches what I set in my critical triggers. The only difference I can see is the decimal formatting: I suspect 100.0 may not be treated as exactly the same as 100. More likely, I think the ping never reaches my triggers because it fails before that point, which is why the status shows CRIT.

So if anyone has a solution, even with a different test, I'm open to it.

Thanks

  • Hi there.

    I completely understand your confusion here. This is a "special" service monitor: 100% loss is always considered critical. This service monitor was designed as the host check (it checks whether something is up or down) for most entities in Uptime. As such, the intention was that you may have some loss and some latency without us considering the device down. This is also why there is the "Number to send" setting. By default we try to ping 5 times; if there is 100% loss, that is critical. When it is used as the host check and goes into a critical state, we stop monitoring everything else on the entity, and the other service monitors that would normally run against it go into an unknown state as a result.

    Are you just trying to alert when the device comes online? If not, what is your intent for monitoring? Perhaps I can help with a different solution.

    Thanks,

    Robert
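
To make the behaviour described above concrete, here is a minimal Python sketch. It is not Uptime's actual implementation: the function names and the `warn_loss_pct` / `crit_loss_pct` parameters are hypothetical and only model the rule stated in the reply, namely that 100% loss is critical whatever thresholds are configured.

```python
from typing import Optional


def loss_percent(sent: int, received: int) -> float:
    """Percentage of pings that got no reply."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


def host_check_status(
    sent: int,
    received: int,
    warn_loss_pct: Optional[float] = None,  # hypothetical user threshold
    crit_loss_pct: Optional[float] = None,  # hypothetical user threshold
) -> str:
    """Illustrative model of the ping host check described in the thread."""
    loss = loss_percent(sent, received)
    # Total loss is critical no matter how the thresholds are set.
    if loss >= 100.0:
        return "CRIT"
    if crit_loss_pct is not None and loss >= crit_loss_pct:
        return "CRIT"
    if warn_loss_pct is not None and loss >= warn_loss_pct:
        return "WARN"
    return "OK"


print(host_check_status(5, 0))                      # CRIT
print(host_check_status(5, 4))                      # OK
print(host_check_status(5, 3, warn_loss_pct=30.0))  # WARN
```

Under this model, no threshold setting can turn an unanswered ping into OK, which matches what the original poster observed.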
  • In reply to Robert Vandervoort:

    Hi,

    And thanks Robert for the answer.

    That is exactly what I concluded: no ping means critical for Uptime.

    Yes, I'm trying to alert when the device comes online. For security purposes it must be offline most of the time, so when it comes online outside the controlled, scheduled times, we (my IT team) have to look at it urgently. So yes, any ideas are welcome.
    On the plus side, I like the feature that skips the remaining monitors: when the default ping check is critical, all the others go unknown, which prevents alert spam.

    Thanks
  • In reply to adminNotarius:

    I think the easiest way would be to use the custom script monitor. When you add a new service monitor, go to the very bottom, under the "advanced and custom" section. Select "Custom" and continue.

    Then set it against that element you're trying to ensure stays down. Give it a name, description, and then in the "Script Name" field put the path to your uptime\scripts folder like so 

    /program files/uptime software/uptime/scripts/pingcheck.bat

    Now we need to drop a batch file called pingcheck.bat in that folder. Feel free, of course, to put this wherever you see fit and call it whatever you like; just make sure it all matches up. The contents of the batch file are:

    ping -n 1 %1

    Then, back in the service monitor you are creating, for the "Arguments" field, put:

    %UPTIME_HOSTNAME%

    Lastly, in the critical status section, choose

    Critical [does not contain]  Destination host unreachable

    Now save it... When you get back to the parent page, hit test. What this is going to do is call the ping command and try a single time to reach the host this service monitor is assigned to. It will output the standard ping output and we are checking to be sure it says the host is unreachable. If it does NOT say that, it will trip the critical state.

    This is actually a good intro to creating custom script monitors. You can use them to accomplish all kinds of things, and creating them really is this easy. Note the format of the script path; that's the only tricky part here! Don't be tempted to use something like C:\program files.... It will not work.

    You may find these links helpful too. First, a list of all the variables we can use, like %UPTIME_HOSTNAME%

    http://docs.uptimesoftware.com/display/UT/Alert+Profile+and+Action+Profile+Variables

    And of course the doc for creating custom monitors.

    http://docs.uptimesoftware.com/display/UT/Creating+Custom+Service+Monitors+in+Uptime+Infrastructure+Monitor

    Let us know if there is anything else we can help with!

    Robert
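
The "does not contain" rule from the reply above can be sketched in a few lines of Python. This only illustrates the string match, not how Uptime evaluates it internally. The function name is hypothetical, and the sample ping lines are illustrative, using documentation-range addresses rather than output captured from a real host.

```python
# Text the monitor looks for in the script output (from the reply above).
UNREACHABLE_MARKER = "Destination host unreachable"


def offline_monitor_status(ping_output: str) -> str:
    """CRIT when the output does NOT contain the marker, otherwise OK."""
    return "OK" if UNREACHABLE_MARKER in ping_output else "CRIT"


# Illustrative sample lines, not captured from a real machine.
unreachable = "Reply from 192.0.2.1: Destination host unreachable."
replied = "Reply from 192.0.2.50: bytes=32 time<1ms TTL=128"
timed_out = "Request timed out."

print(offline_monitor_status(unreachable))  # OK
print(offline_monitor_status(replied))      # CRIT
print(offline_monitor_status(timed_out))    # CRIT
```

Note the last case: under this rule, any output without the marker trips critical, including a plain timeout. If the test shows a timeout message for the offline machine instead of "unreachable", the match string may need adjusting.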

  • In reply to Robert Vandervoort:

    It works perfectly! The test runs as expected. Thanks a lot, Robert.

    I just have one small concern: the offline monitor now does what it's supposed to do, but the default ping check overrides all the sub-monitors, including this one.
    I'm looking for a way to bypass the host check for it, so that this monitor can be green while the other monitors keep their dependency on the host check (the default ping) and stay UNKNOWN.
    Or, from a different angle, I could replace the default host check with the offline monitor. But in that case the host check would be in OK status, so all the sub-monitors would be checked and would return CRIT. So I'm not sure this is the right way to get my offline monitor green and the others UNKNOWN.
    Or, a third and trickier alternative: use the offline monitor as the host check, so that if it is OK, none of the others are checked and they keep the UNKNOWN status, and if it is not OK, meaning CRIT, all the others can be checked.
    That may not be so tricky after all, since I have a feeling it's possible with a custom script linked to a .bat file.
  • In reply to adminNotarius:

    Well, you might just consider leaving it as the default setup without the new monitor, and not alerting on crit for the ping, but instead only alert on recovery of the ping. Then whenever the element in question comes online the other monitors will start working and you'll get an email that the element came back online. Now, the new monitor we just created will be critical, and that would be responsible for continuing to alert you that the element is up, getting your attention.

    Makes sense?
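
The "alert only on recovery" idea amounts to reacting to a CRIT to OK transition of the ping check and staying quiet otherwise. Here is a minimal Python sketch of that idea, using hypothetical function names and a made-up status history. It does not show how the alert is configured in Uptime itself.

```python
from typing import List


def should_alert(previous: str, current: str) -> bool:
    """Alert only when the ping check recovers (CRIT -> OK)."""
    return previous == "CRIT" and current == "OK"


def recovery_alerts(statuses: List[str]) -> List[int]:
    """Return the positions in the history where a recovery alert would fire."""
    return [
        i
        for i in range(1, len(statuses))
        if should_alert(statuses[i - 1], statuses[i])
    ]


# Made-up history: the machine is offline, comes online, goes off, comes back.
history = ["CRIT", "CRIT", "OK", "OK", "CRIT", "OK"]
print(recovery_alerts(history))  # [2, 5]
```

The machine going offline again (OK to CRIT) produces no alert in this sketch, which is the desired behaviour for a device that is supposed to be off most of the time.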