Uptime Infrastructure Monitor

Uptime Infrastructure Monitor (UIM) is an enterprise tool that provides a unified view for Information Technology systems monitoring. Although comprehensive and scalable, UIM is also easy to use and deploys in minutes. Through proactive, comprehensive, and integrated monitoring, reporting, and alerting, UIM gives complete visibility and control over the availability, performance, and capacity of Information Technology environments -- from applications to servers and virtual machines, to network devices. UIM also monitors the performance of a range of database management systems: Microsoft SQL Server, Oracle, MySQL, MariaDB, PostgreSQL, IBM DB2, and Sybase.

This blog post discusses the service monitors and the advanced monitor plug-ins for database performance monitoring.

Service Monitors

Service Monitors are processes within UIM that regularly check the performance and availability of services in your environment. If the monitor detects a problem, UIM issues an alert. The different types of service monitors that are available in UIM are agent monitors, Microsoft Windows monitors, virtual machine (VM) monitors, database monitors, application monitors, network service monitors, and advanced monitors.

Advanced Monitors are custom scripts and programs. Examples of advanced monitors are available as downloadable plug-ins. The advanced monitors do not require an installed agent on monitored systems.

Microsoft SQL Server

The Basic Checks Monitor determines whether an instance is listening on a server’s ports, determines whether an instance can process queries, and checks for values in any base and computed tables. The monitor runs queries against an instance, matches for user-specified regular expressions, and reports and alerts on the response and response time.

The Query Monitor is similar to the Basic Checks monitor. The monitor runs queries against an instance, and checks the response and the response time. For a response that is numeric or the number of returned rows, it uses an operator such ‘greater than’ and ‘less than’ to check the output. For a response that is a text string, it uses an operator such as ’contains’ and ’does not contain’ to check the output. If the response does not match the expected outcome, then it sets a warning or critical state for reporting and alerting.

The Advanced Metrics Monitor collects information on the availability and performance of individual databases. Aggregate performance metrics per instance or capture separately different performance metrics for each database. The monitor relies on agent or Windows Management Instrumentation (WMI) connections to access the Windows counters that provide the metrics. To monitor a virtual machine (VM) element, it needs to use agent or WMI collection. The monitor captures these metrics: Lock Wait / Sec., Lock Requests / Sec., Average Lock Wait Time, User Connections, Transactions / Sec., Data File Size / KB, Total Latch Wait Time, Latch Waits / Sec., Average Latch Wait Time, Maximum Workspace Memory, Connection Memory, SQL Cache Memory, Total Server Memory, and Response Time.

The Tablespace Check Monitor evaluates the size of data files within databases. The monitor gathers information from all databases across all instances on a system and aggregates this information in its metrics. The monitor also reports and alerts whether any of the data files in a filegroup or any log file in any database on an instance exceeds user-specified warning and critical thresholds.

For cluster and Always On Availability Groups monitoring, refer to the community forum post at http://community.idera.com/it-management/uptime-infrastructure-monitor/f/312/t/23447.

For job status monitoring, refer to the community forum post at http://community.idera.com/it-management/uptime-infrastructure-monitor/f/309/t/16417.

Oracle

The Basic Checks Monitor determines whether a host server is available, determines whether an instance is listening on a server’s ports, determines whether it can log into an instance, and evaluates a response based on a SQL script that runs against an instance or a database. For the executed SQL script, the monitor measures the response time, and matches for the response text using user-specified regular expressions for reporting and alerting.

The Query Monitor is similar to the Basic Checks monitor. The monitor runs queries against an instance, and checks the response and the response time. For a response that is numeric or the number of returned rows, it uses an operator such as ‘greater than’ and ‘less than’ to check the output. For a response that is a text string, it uses operators such as ‘contains’ and ‘does not contain’ to check the output. If the output does not match the expected outcome, then it creates a warning or critical state for reporting and alerting.

The Tablespace Check Monitor checks the relative size of individual tablespaces within database instances. The monitor reports and alerts when a tablespace in an instance exceeds user-specified thresholds.

The Advanced Metrics Monitor collects metrics for database performance. Some metrics are intended for tuning databases for long-term performance gains, rather than avoiding outages. The long-term probes are: Buffer Cache, Data Dictionary Cache, Disk Sort Ratio, Library Cache, and Redo Log. The monitor collects these metrics: Buffer Cache Hits Ratio, Data Dictionary Cache Hits Ratio, Library Cache Hits Ratio, Redo Log Space Request Ratio, Disk Sort Rate, Active Sessions, Oracle Blocking Sessions, Oracle Idle Sessions, and Response Time.

The Extendable Tablespace Check Monitor collects metrics for tablespace capacity. The monitor collects these metrics: Available Space, Used Space, Free Space, Percent Free, and Response Time. The monitor captures these metrics for each tablespace within a database, and uses these metrics for alerting and retains them for graphing within UIM.

MySQL and MariaDB

The Basic Checks Monitor determines whether a host that is running a database is available, determines whether it can log into a database, and evaluates a response based on the execution of a user-specified SQL script against an instance or a database. That is, the monitor runs queries, and matches for user-specified regular expressions for reporting and alerting.

The Advanced Metrics Monitor determines whether an instance is listening on a server’s ports, and checks performance values to establish the efficiency of an instance. The monitor collects these metrics: Uptime, Questions, Slow Queries, Open Tables, QPSA, Bytes Received, Bytes Sent, Delayed Insert Threads, Delayed Errors, Max Used Connections, Open Files, Open Streams, Table Locks Immediate, Table Locks Waited, Threads Cached, Threads Connected, Threads Running, QCache Queries In Cache, QCache Inserts, QCache Hits, QCache Lowmem Prunes, QCache Not Cached, QCache Free Memory, QCache Free Blocks, QCache Total Blocks, and Response Time. The monitor checks the performance metrics of databases and instances that are running on a system against user-specified thresholds. If an instance or a database is not responding, then a database can process queries, but the results demonstrate behavior that alerts to a problem.

The Replication Monitor collects these metrics: Seconds Behind Master, Slave IO State, Slave IO Running, Slave SQL Running, Last IO Error Number, Last IO Error, Last SQL Error Number, Last SQL Error, and Response Time.

The Status Monitor Plug-in gathers from a database using the ‘Status’ command Connections / Threads Connected, Open Tables, and Average Queries per Second.

PostgreSQL

The Basic Monitor Plug-in captures the availability of a database for reporting and alerting.

The Availability Monitor Plug-in determines whether it can connect to an instance, runs queries against an instance or database, and compares the response to user-specified text strings for reporting and alerting.

IBM DB2

The Basic Monitor Plug-in runs queries against an instance, matches for user-specified regular expressions, and reports and alerts on the response and the response time.

The Tablespace Monitor Plug-in collects metrics concerning tablespace performance for reporting and alerting. The monitor captures these metrics: Tablespace Used, Tablespace Available, Tablespace Used, Used Pages In All Tablespaces, and Free Pages in all Tablespaces. The monitor collects the data for all of the tablespaces that exist on an instance.

The Buffer Pool Monitor Plug-in collects metrics concerning buffer performance for reporting and alerting. The monitor captures these metrics: Non-Accessed Asynchronous Reads, Average Asynchronous Read Time, Average Synchronous Read Time, Average Read Time, Non-Prefetched Synchronous Reads, Average Asynchronous Write Time, Average Synchronous Write Time, Average Write Time, Asynchronous Write Percentage, Data Hit Ratio, Index Hit Ratio, Auxiliary Storage Objects Hit Ratio, Total Hit Ratio (Index, Data, and XDA), and Response Time.

Sybase

The Sybase Monitor determines whether a database is listening to a server’s ports, runs queries against an instance or database, matches for user-specified regular expressions, and reports and alerts on the response and the response time.

Takeaway

Uptime Infrastructure Monitor (UIM) provides comprehensive and unified monitoring and optimization of physical and virtual servers, applications, networks, and databases from a central dashboard. Also, use historical data to plan for future server capacity needs, and track service-level performance trends for compliance and reporting concerning service level agreements.

Anonymous