June 2012 Release

Terminology Change: Agents are now Collectors

In this release, we’ve renamed “Agents” to “Collectors”. They have the same functionality as always – but some people thought that “Agents” had to be installed on every system in order to monitor it. To make clear that our deployment is very flexible (you can run one collector per host, one collector per datacenter, or two per datacenter with failover), we’ve renamed them in the UI and in our documentation.

New Features

  • One of the bigger changes in this release is one you may not notice – we’ve split off the in-memory cache of reported data into its own system.
    What this means is that when we restart the main application, reported data that has not yet been flushed to disk will not be lost; the application will restart much quicker; and it will also be simpler and more stable.
  • Users with limited roles can now be given full authority over a group of hosts – they can add, delete and modify hosts within the group. A new privilege, “Manage hosts/subgroups in this group”, in the hosts section of a role delegates this power.

Improvements

  • Alerting and Alerts:
    • Introduction of the Alert Clear Interval. Similar to the Alert Trigger Interval, which specifies how many consecutive polls an alert condition must exist for before the alert is triggered, the Alert Clear Interval specifies how long the reported data must not satisfy an alert threshold before the alert is declared cleared. This helps prevent alert noise for data that is oscillating around a threshold. Initially, all Alert Clear Intervals are set to zero, to maintain the existing behavior, but this can be tuned per alert.
    • The subject line of acknowledgement messages has changed to make it more evident on short screens that the message is an acknowledgement, not a new alert. Instead of “Alert host/datasource/datapoint is acknowledged by user” the response will now be: “Ack of host/datasource/datapoint by user”.
    • If an alert has been acknowledged, alert clear notifications are now only sent to the person that acknowledged the alert, rather than all people that had received the alert. (They will also have received the Acknowledgement notification.)
    • The template for the  ‘Alert cleared’ message is now editable along with other message templates (under Settings…Alert Settings..Other Alert Settings..Default Alert Templates)
    • Added a new token ##LIMITEDMESSAGE## to be used in event alert templates – this is only the first 10 words of the event – useful for voice alerting, where reading the entire event would be too long.
    • You can now select alert rules or alert chains as filters on the alert tab filters – allowing you to see alerts that were routed to you (or others.)
    • It is now possible to use the browser function to ‘open in new tab’  for the hosts in the alert tab. A user with lots of alerts can now right-click and open a new tab for many alerts and then work through the tabs afterwards.
  • Collectors:
    • If the same collector binary is installed and run on a different system while the original collector has been heard from within the previous 5 minutes, the second collector will automatically shut down, to prevent duplicate collectors with the same ID.
    • Collectors now support silent and scripted installation.
    • The alert notification chain for escalating alerts about collectors being down is now configurable on a per collector basis.
    • Performance improvements reducing the number of connections required for collectors communicating through proxy servers.
  • Reporting
    • The description field of reports now shows on the reports.
    • Reports can now be assigned to report folders by drag and drop.
    • SLA reports now support multiple metrics on a single report
    • New Report: Host Group Inventory report, which shows host count for each group (and optionally subgroup) and other properties for each group.
  • Dashboards now support manually refreshing a widget, and the ability to automatically rotate among a set of dashboards.
  • Improvements to Remote Session functionality – the SSH or RDP window is now larger by default; authentication screens are consolidated into one; better error handling.
  • The Audit Log now tracks Remote session connection attempts, as well as more details about changing of alerting and thresholds on instances.
  • When downloading a host graph as a PNG, it is now uniquely named for the host and instance, to avoid overwriting other downloaded graphs.
  • Hosts and groups are now clickable on Google Map overlays
  • The Settings tab is now hidden if the logged in user has no rights to settings.
  • If, during a network scan, the discovered hosts respond to the top level SNMP community (set at the top node of the hosts tree), then the result of the system.sysname OID will be used to identify the host when adding it to monitoring.
  • Changed the Auto Properties detection (which determines the kind of host – NetApp, Linux, Windows, etc.) so that it no longer runs periodically after the kind of host has been discovered, to reduce log messages in some environments.
  • The date format in the forms to create/edit Scheduled Downtime now spells out the month, rather than using the month number, to avoid international ordering issues (“1/12/2012” can be interpreted as Jan 12th or December 1st, depending on convention, so it is now explicitly “Jan 12, 2012”.)
  • When a proxy is configured for the collector to communicate to the LogicMonitor service, the proxy credentials are now stored encrypted in the agent.conf file.
  • Various Windows collector installer and wizard improvements
  • Webhooks now have additional tokens available, to pass in contact info specific for the admins
  • Webhooks now allow parameters to be appended to the end of the URL that is called. These can be specified in the URL itself, not just in the webhook template.
  • Syslog monitoring will now assume that syslog messages received by the collector from a source of 127.0.0.1 came from the collector.
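
To illustrate how the Alert Trigger Interval and the Alert Clear Interval described above interact, here is a minimal sketch in Python. It is not the product's implementation: the class name AlertState, the method poll, the "greater than" threshold comparison, and the exact counting rule (a clear interval of N requires more than N consecutive non-breaching polls, so that zero clears on the first good poll) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class AlertState:
    """Hypothetical model of one alert on one datapoint."""
    threshold: float
    trigger_interval: int = 1  # consecutive breaching polls needed to trigger
    clear_interval: int = 0    # assumed: extra good polls needed before clearing
    active: bool = False
    breaches: int = 0          # current run of polls above the threshold
    oks: int = 0               # current run of polls at or below the threshold

    def poll(self, value: float) -> bool:
        """Feed one polled value; return whether the alert is active afterwards."""
        if value > self.threshold:
            self.breaches += 1
            self.oks = 0
            if not self.active and self.breaches >= self.trigger_interval:
                self.active = True
        else:
            self.oks += 1
            self.breaches = 0
            # With clear_interval == 0 the first good poll clears the alert,
            # which matches the pre-existing behavior described in the notes.
            if self.active and self.oks > self.clear_interval:
                self.active = False
        return self.active


def run(values, **settings):
    """Replay a list of polled values and return the alert state after each."""
    state = AlertState(threshold=90, **settings)
    return [state.poll(v) for v in values]


# Data oscillating around a threshold of 90.
oscillating = [95, 85, 95, 85, 95, 85, 85, 85]
noisy = run(oscillating)                     # clear interval 0: flaps on every poll
quiet = run(oscillating, clear_interval=2)   # stays active until 3 good polls in a row
```

In this sketch the default settings produce three separate trigger/clear cycles for the oscillating data, while a clear interval of 2 yields a single alert that clears only on the last poll.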

Bugs Corrected

  • In the Smartgraph view of custom dashboard graphs with many instances, it was not possible to scroll to the bottom of the graph to get to the Download data button. This is corrected.
  • When clicking a host group name for a cluster alert on the alerts tab, an error was displayed instead of navigating to the relevant host group. This is corrected.
  • Corrected an issue where some properties that should have been protected were visible in the output of certain debug commands
  • Corrected an issue in the autocomplete of the host selection in various reports.
  • Corrected an issue that caused the Alert Filter to appear to be cleared after collapsing/expanding the filter.
  • Corrected an issue that caused a “Null” pop up error when a host group that was displayed on a map widget was deleted
  • Some alert clear messages were not being sent in cases where error or critical alerts had an escalation destination, but warning alerts for the same datapoint did not. This has been corrected.
  • Corrected a typo in the formula graphing ESX datastore usage
  • Corrected a bug that could cause a delay in when a change to event source monitoring took effect on a collector
  • Corrected an issue where repeated collector restarts could delay alerts detected by that collector
  • Corrected an issue where the Services checks did not always report the correct failure mode when DNS could not be resolved for the site being checked.
  • Corrected an issue that caused alerts to be omitted from the Alert Report, if the specific alert had been disabled subsequent to the triggered alert.
  • A voice call recipient defined in a recipient group, as opposed to directly in the chain, was not having voice calls escalated to them.
  • Corrected an issue that was causing data that was not collected (due to timeouts or other reasons) to clear existing alerts on those datapoints. NaN should not clear alerts, and does not.
  • Corrected an issue that was causing instances to be removed if a timeout occurred not during the discovery of the instance, but in the application of filters to the discovered set of instances.
  • External groovy scripts were not being timed out correctly if they did not exit cleanly
  • A bug prevented downloading the csv data from the smart graph dialog for a service graph
  • Fixed an issue of loading the google maps API over HTTP, not HTTPS, which caused insecure content warnings
  • Corrected an issue whereby alerts were still sent for hosts not in any groups, even though Alert Enable was cleared on the top level container in the Hosts tab.
  • Corrected an issue with EC2 network discovery via the Amazon API preventing all dynamic EC2 hosts from being discovered.
  • HTTP Active Discovery was not using defined authentication credentials during discovery
  • Corrected an issue where a failed response from an ESX or vSphere server was being reported as a -1 value, instead of NaN. This was causing some incorrect alerts.
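
Two of the fixes above concern how uncollected data (NaN) is treated during alert evaluation. The following Python sketch shows the intended rule under simple assumptions: the function names next_alert_state and to_sample, the "greater than" comparison, and the use of None to represent a failed response are hypothetical and chosen only for illustration.

```python
import math
from typing import Optional


def to_sample(raw: Optional[float]) -> float:
    """Map a failed collection (modeled here as None) to NaN, not to a sentinel like -1."""
    return float("nan") if raw is None else float(raw)


def next_alert_state(active: bool, value: float, threshold: float) -> bool:
    """Return the alert state after evaluating one sample against a threshold."""
    if math.isnan(value):
        # No data was collected: neither trigger nor clear, keep the current state.
        return active
    return value > threshold


# An existing alert survives a poll that timed out...
still_active = next_alert_state(True, to_sample(None), threshold=90)
# ...and clears only once real data falls back under the threshold.
cleared = next_alert_state(True, to_sample(50), threshold=90)
```

A sentinel such as -1 is an ordinary number, so it would be compared against thresholds like any real reading; NaN makes the "no data" case explicit so it can be skipped.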

Known Issues

  • The UI currently allows the addition of SMS contacts to escalation chains, but using them is not yet recommended. While SMS messages will be delivered, acknowledgement via SMS is not yet functional, and SMS messages will be billed as per voice calls. We will remove the SMS contact method in a point release until we correct the outstanding issues.
  • The privilege to manage hosts in a group does not yet allow the ability to modify thresholds on the managed hosts.
