Metrics and Metadata
Some additional metrics for NGINX monitoring will only be reported if the NGINX configuration file is modified accordingly. See Additional NGINX Metrics below, and pay attention to the Source and Variable fields in the metric descriptions that follow.
OS Metrics
Type: internal, integer
Description: 1 - agent is up, 0 - agent is down.
- amplify.agent.cpu.system
- amplify.agent.cpu.user
Type: gauge, percent
Description: CPU utilization percentage observed from the agent process.
- amplify.agent.mem.rss
- amplify.agent.mem.vms
Type: gauge, bytes
Description: Memory utilized by the agent process.
- system.cpu.idle
- system.cpu.iowait
- system.cpu.system
- system.cpu.user
Type: gauge, percent
Description: System CPU utilization.
Type: gauge, percent
Description: System CPU stolen. Represents time when the real CPU was not available to
the current VM.
- system.disk.free
- system.disk.total
- system.disk.used
Type: gauge, bytes
Description: System disk usage statistics.
Type: gauge, percent
Description: System disk usage statistics, percentage.
- system.io.iops_r
- system.io.iops_w
Type: counter, integer
Description: Number of reads or writes per sampling window.
- system.io.kbs_r
- system.io.kbs_w
Type: counter, kilobytes
Description: Number of kilobytes read or written.
- system.io.wait_r
- system.io.wait_w
Type: gauge, milliseconds
Description: Time spent reading from or writing to disk.
- system.load.1
- system.load.5
- system.load.15
Type: gauge, float
Description: Number of processes in the system run queue, averaged over the last 1, 5,
and 15 min.
- system.mem.available
- system.mem.buffered
- system.mem.cached
- system.mem.free
- system.mem.shared
- system.mem.total
- system.mem.used
Type: gauge, bytes
Description: Statistics about system memory usage.
Type: gauge, percent
Description: Statistics about system memory usage, percentage.
- system.net.bytes_rcvd
- system.net.bytes_sent
Type: counter, bytes
Description: Network I/O statistics. Number of bytes received or sent, per network
interface.
- system.net.drops_in.count
- system.net.drops_out.count
Type: counter, integer
Description: Network I/O statistics. Total number of inbound or outbound packets
dropped, per network interface.
- system.net.packets_in.count
- system.net.packets_out.count
Type: counter, integer
Description: Network I/O statistics. Number of packets received or sent, per network
interface.
- system.net.packets_in.error
- system.net.packets_out.error
Type: counter, integer
Description: Network I/O statistics. Total number of errors while receiving or sending,
per network interface.
- system.net.listen_overflows
Type: counter, integer
Description: Number of times the listen queue of a socket overflowed.
- system.swap.free
- system.swap.total
- system.swap.used
Type: gauge, bytes
Description: System swap memory statistics.
Type: gauge, percent
Description: System swap memory statistics, percentage.
NGINX Metrics
HTTP Connections and Requests
- nginx.http.conn.accepted
- nginx.http.conn.dropped
Type: counter, integer
Description: NGINX-wide statistics describing HTTP connections.
Source: stub_status (or NGINX Plus status API)
- nginx.http.conn.active
- nginx.http.conn.current
- nginx.http.conn.idle
Type: gauge, integer
Description: NGINX-wide statistics describing HTTP connections.
Source: stub_status (or NGINX Plus status API)
Type: counter, integer
Description: Total number of client requests.
Source: stub_status (or NGINX Plus status API)
- nginx.http.request.current
- nginx.http.request.reading
- nginx.http.request.writing
Type: gauge, integer
Description: Number of currently active requests (reading and writing). Number of
requests reading headers or writing responses to clients.
Source: stub_status (or NGINX Plus status API)
- nginx.http.request.malformed
Type: counter, integer
Description: Number of malformed requests.
Source: access.log
- nginx.http.request.body_bytes_sent
Type: counter, integer
Description: Number of bytes sent to clients, not counting response headers.
Source: access.log
HTTP Methods
- nginx.http.method.get
- nginx.http.method.head
- nginx.http.method.post
- nginx.http.method.put
- nginx.http.method.delete
- nginx.http.method.options
Type: counter, integer
Description: Statistics about observed request methods.
Source: access.log
HTTP Status Codes
- nginx.http.status.1xx
- nginx.http.status.2xx
- nginx.http.status.3xx
- nginx.http.status.4xx
- nginx.http.status.5xx
Type: counter, integer
Description: Number of requests with HTTP status codes per class.
Source: access.log
- nginx.http.status.403
- nginx.http.status.404
- nginx.http.status.500
- nginx.http.status.502
- nginx.http.status.503
- nginx.http.status.504
Type: counter, integer
Description: Number of requests with specific HTTP status codes above.
Source: access.log
- nginx.http.status.discarded
Type: counter, integer
Description: Number of requests finalized with status code 499, which is logged when the
client closes the connection.
Source: access.log
HTTP Protocol Versions
- nginx.http.v0_9
- nginx.http.v1_0
- nginx.http.v1_1
- nginx.http.v2
Type: counter, integer
Description: Number of requests using a specific version of the HTTP protocol.
Source: access.log
NGINX Process Metrics
Type: gauge, integer
Description: Number of NGINX worker processes observed.
- nginx.workers.cpu.system
- nginx.workers.cpu.total
- nginx.workers.cpu.user
Type: gauge, percent
Description: CPU utilization percentage observed for NGINX worker processes.
Type: gauge, integer
Description: Number of file descriptors utilized by NGINX worker processes.
- nginx.workers.io.kbs_r
- nginx.workers.io.kbs_w
Type: counter, integer
Description: Number of kilobytes read from or written to disk by NGINX worker processes.
- nginx.workers.mem.rss
- nginx.workers.mem.vms
Type: gauge, bytes
Description: Memory utilized by NGINX worker processes.
- nginx.workers.mem.rss_pct
Type: gauge, percent
Description: Memory utilization percentage for NGINX worker processes.
- nginx.workers.rlimit_nofile
Type: gauge, integer
Description: Hard limit on the number of file descriptors as seen by NGINX worker
processes.
Additional NGINX Metrics
NGINX Amplify Agent can collect a number of additional useful metrics described below. To enable these metrics, please make the following configuration changes. More predefined graphs will be added to the Graphs page if the agent finds additional metrics. With the required log format configuration, you'll be able to build more specific custom graphs.
The access.log log format should include an extended set of NGINX variables. Please add a new log format or modify the existing one, and use it with the access_log directives in your NGINX configuration.
log_format main_ext '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'"$host" sn="$server_name" '
'rt=$request_time '
'ua="$upstream_addr" us="$upstream_status" '
'ut="$upstream_response_time" ul="$upstream_response_length" '
'cs=$upstream_cache_status';
Here's how you may use the extended log format with your access log configuration:
access_log /var/log/nginx/access.log main_ext;
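To make the field layout of the main_ext format above concrete, here is a minimal Python sketch that parses one log line written in that format. The regular expression, the group names, the helper name parse_main_ext, and the sample line are illustrative assumptions; this is not how the agent itself reads logs.

```python
import re

# Regex mirroring the main_ext log_format shown above. The group names are
# illustrative choices for this sketch, not something the agent defines.
MAIN_EXT = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)" '
    r'"(?P<http_x_forwarded_for>[^"]*)" "(?P<host>[^"]*)" '
    r'sn="(?P<server_name>[^"]*)" rt=(?P<request_time>\S+) '
    r'ua="(?P<upstream_addr>[^"]*)" us="(?P<upstream_status>[^"]*)" '
    r'ut="(?P<upstream_response_time>[^"]*)" '
    r'ul="(?P<upstream_response_length>[^"]*)" '
    r'cs=(?P<upstream_cache_status>\S+)'
)


def parse_main_ext(line):
    """Return a dict of fields for one main_ext line, or None if it does not match."""
    match = MAIN_EXT.match(line)
    return match.groupdict() if match else None


# Hypothetical sample line, made up for this example.
sample = (
    '203.0.113.7 - - [18/May/2017:14:02:11 +0000] "GET /index.html HTTP/1.1" '
    '200 612 "-" "curl/7.47.0" "-" "example.com" sn="example.com" rt=0.005 '
    'ua="127.0.0.1:8080" us="200" ut="0.004" ul="612" cs=MISS'
)
fields = parse_main_ext(sample)
print(fields["status"], fields["request_time"], fields["upstream_cache_status"])
```

All values come back as strings; a real consumer would still need to convert timings and lengths, and to handle the "-" placeholder NGINX writes when a variable is empty.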
Note. Please bear in mind that by default the agent will process all access logs that are found in your log directory. If you define a new log file with the extended log format that will contain the entries being already logged to another access log, your metrics might be counted twice. Please refer to the agent configuration section above to learn how to exclude specific log files from processing.
The error.log log level should be set to warn.
error_log /var/log/nginx/error.log warn;
Note. Don't forget to reload your NGINX configuration with either kill -HUP or service nginx reload.
Here is the list of additional metrics that can be collected from the NGINX log files:
- nginx.http.request.bytes_sent
Type: counter, integer
Description: Number of bytes sent to clients.
Source: access.log (requires custom log format)
Variable: $bytes_sent
- nginx.http.request.length
Type: gauge, integer
Description: Request length, including request line, header, and body.
Source: access.log (requires custom log format)
Variable: $request_length
- nginx.http.request.time
- nginx.http.request.time.count
- nginx.http.request.time.max
- nginx.http.request.time.median
- nginx.http.request.time.pctl95
Type: gauge, seconds.milliseconds
Description: Request processing time — time elapsed between reading the first bytes from
the client and writing a log entry after the last bytes were sent.
Source: access.log (requires custom log format)
Variable: $request_time
- nginx.http.request.buffered
Type: counter, integer
Description: Number of requests that were buffered to disk.
Source: error.log (requires 'warn' log level)
Type: gauge, float
Description: Achieved compression ratio, calculated as the ratio between the original
and compressed response sizes.
Source: access.log (requires custom log format)
Variable: $gzip_ratio
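The timing metrics in this section come with .count, .max, .median, and .pctl95 variants. As a rough illustration of what such aggregates mean, the Python sketch below summarizes a list of $request_time samples. The function name summarize_times and the nearest-rank percentile method are assumptions made for this example; the agent's actual aggregation method is not described here.

```python
import statistics


def summarize_times(samples):
    """Aggregate per-request timings (in seconds) from one reporting interval."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    # Nearest-rank 95th percentile, computed with integer arithmetic:
    # rank = ceil(0.95 * n), then take the value at that 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "count": len(ordered),
        "max": ordered[-1],
        "median": statistics.median(ordered),
        "pctl95": ordered[rank - 1],
    }


# Hypothetical $request_time values taken from five log entries.
times = [0.005, 0.010, 0.020, 0.200, 1.500]
summary = summarize_times(times)
print(summary)
```

With only a handful of samples the 95th percentile collapses to the maximum, which is one reason the median and the maximum are worth graphing alongside it.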
Upstream Metrics
- nginx.upstream.connect.time
- nginx.upstream.connect.time.count
- nginx.upstream.connect.time.max
- nginx.upstream.connect.time.median
- nginx.upstream.connect.time.pctl95
Type: gauge, seconds.milliseconds
Description: Time spent on establishing connections with upstream servers. With SSL, it
also includes time spent on the handshake.
Source: access.log (requires custom log format)
Variable: $upstream_connect_time
- nginx.upstream.header.time
- nginx.upstream.header.time.count
- nginx.upstream.header.time.max
- nginx.upstream.header.time.median
- nginx.upstream.header.time.pctl95
Type: gauge, seconds.milliseconds
Description: Time spent on receiving response headers from upstream servers.
Source: access.log (requires custom log format)
Variable: $upstream_header_time
- nginx.upstream.response.buffered
Type: counter, integer
Description: Number of upstream responses buffered to disk.
Source: error.log (requires 'warn' log level)
- nginx.upstream.request.count
- nginx.upstream.next.count
Type: counter, integer
Description: Number of requests that were sent to upstream servers.
Source: access.log (requires custom log format)
Variable: $upstream_*
- nginx.upstream.request.failed
- nginx.upstream.response.failed
Type: counter, integer
Description: Number of requests and responses that failed while proxying.
Source: error.log (requires 'error' log level)
- nginx.upstream.response.length
Type: gauge, bytes
Description: Average length of the responses obtained from the upstream servers.
Source: access.log (requires custom log format)
Variable: $upstream_response_length
- nginx.upstream.response.time
- nginx.upstream.response.time.count
- nginx.upstream.response.time.max
- nginx.upstream.response.time.median
- nginx.upstream.response.time.pctl95
Type: gauge, seconds.milliseconds
Description: Time spent on receiving responses from upstream servers.
Source: access.log (requires custom log format)
Variable: $upstream_response_time
- nginx.upstream.status.1xx
- nginx.upstream.status.2xx
- nginx.upstream.status.3xx
- nginx.upstream.status.4xx
- nginx.upstream.status.5xx
Type: counter, integer
Description: Number of responses from upstream servers with specific HTTP status codes.
Source: access.log (requires custom log format)
Variable: $upstream_status
Cache Metrics
- nginx.cache.bypass
- nginx.cache.expired
- nginx.cache.hit
- nginx.cache.miss
- nginx.cache.revalidated
- nginx.cache.stale
- nginx.cache.updating
Type: counter, integer
Description: Various statistics about NGINX cache usage.
Source: access.log (requires custom log format)
Variable: $upstream_cache_status
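The cache metrics above map onto the values NGINX writes for $upstream_cache_status. The following Python sketch shows one way such values could be tallied into per-status counters. The function name cache_counters and the sample values are hypothetical, and the sketch is not the agent's implementation.

```python
from collections import Counter

# Values that NGINX can write for $upstream_cache_status.
TRACKED = ("BYPASS", "EXPIRED", "HIT", "MISS", "REVALIDATED", "STALE", "UPDATING")


def cache_counters(statuses):
    """Count cache statuses and key them by the metric names listed above."""
    counts = Counter(status for status in statuses if status in TRACKED)
    return {"nginx.cache." + name.lower(): counts[name] for name in TRACKED}


# Hypothetical cs= values from six log entries; "-" means no cache was involved.
observed = ["HIT", "MISS", "HIT", "-", "EXPIRED", "HIT"]
counters = cache_counters(observed)
print(counters["nginx.cache.hit"], counters["nginx.cache.miss"])
```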
NGINX Plus Metrics
In NGINX Plus a number of additional metrics describing various aspects of NGINX performance are available. The API module in NGINX Plus is responsible for collecting and exposing all of the additional counters and gauges. The NGINX Plus metrics currently supported by the agent are described below. The NGINX Plus metrics have the "plus" prefix in their names.
Some of the NGINX Plus metrics extracted from the connections and the requests datasets are used to generate the following server-wide metrics (instead of using the stub_status metrics):
nginx.http.conn.accepted = connections.accepted
nginx.http.conn.active = connections.active
nginx.http.conn.current = connections.active + connections.idle
nginx.http.conn.dropped = connections.dropped
nginx.http.conn.idle = connections.idle
nginx.http.request.count = requests.total
nginx.http.request.current = requests.current
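The mapping above can be written out as a small Python function. This is only a sketch: the function name server_wide_metrics and the sample numbers are made up, and the two input dictionaries stand in for the connections and requests datasets.

```python
def server_wide_metrics(connections, requests):
    """Derive the server-wide nginx.http.* metrics from NGINX Plus datasets."""
    return {
        "nginx.http.conn.accepted": connections["accepted"],
        "nginx.http.conn.active": connections["active"],
        # "current" is the sum of active and idle connections.
        "nginx.http.conn.current": connections["active"] + connections["idle"],
        "nginx.http.conn.dropped": connections["dropped"],
        "nginx.http.conn.idle": connections["idle"],
        "nginx.http.request.count": requests["total"],
        "nginx.http.request.current": requests["current"],
    }


# Hypothetical dataset values, for illustration only.
connections = {"accepted": 1000, "dropped": 2, "active": 7, "idle": 13}
requests = {"total": 4500, "current": 3}
metrics = server_wide_metrics(connections, requests)
print(metrics["nginx.http.conn.current"])
```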
The NGINX Plus metrics below are collected per zone. When configuring a graph using these metrics, please make sure to pick the correct server, upstream or cache zone. A more granular peer-specific breakdown of the metrics below is currently not supported in NGINX Amplify.
A cumulative metric set is also maintained internally by summing up the per-zone metrics. If you don't select a specific zone when building graphs, the result is an "all zones" visualization. For example, graphing plus.http.status.2xx without a zone displays the instance-wide sum of successful requests across all zones.
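The cumulative view amounts to summing each metric over every zone. A minimal Python sketch of that idea follows; the zone names, the numbers, and the function name all_zones are hypothetical.

```python
from collections import Counter


def all_zones(per_zone):
    """Sum each metric across zones to get the instance-wide value."""
    total = Counter()
    for zone_metrics in per_zone.values():
        total.update(zone_metrics)  # Counter.update adds values per key
    return dict(total)


# Hypothetical per-zone counters; the zone names are made up.
per_zone = {
    "zone_a": {"plus.http.status.2xx": 120, "plus.http.status.5xx": 3},
    "zone_b": {"plus.http.status.2xx": 80, "plus.http.status.5xx": 1},
}
combined = all_zones(per_zone)
print(combined["plus.http.status.2xx"])
```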
Server Zone Metrics
- plus.http.request.count
- plus.http.response.count
Type: counter, integer
Description: Number of client requests received, and responses sent to clients.
Source: NGINX Plus status API
- plus.http.request.bytes_rcvd
- plus.http.request.bytes_sent
Type: counter, bytes
Description: Number of bytes received from clients, and bytes sent to clients.
Source: NGINX Plus status API
- plus.http.status.1xx
- plus.http.status.2xx
- plus.http.status.3xx
- plus.http.status.4xx
- plus.http.status.5xx
Type: counter, integer
Description: Number of responses with status codes 1xx, 2xx, 3xx, 4xx, and 5xx.
Source: NGINX Plus status API
- plus.http.status.discarded
Type: counter, integer
Description: Number of requests completed without sending a response.
Source: NGINX Plus status API
Type: counter, integer
Description: Total number of successful SSL handshakes.
Source: NGINX Plus status API
Type: counter, integer
Description: Total number of failed SSL handshakes.
Source: NGINX Plus status API
Type: counter, integer
Description: Total number of session reuses during SSL handshake.
Source: NGINX Plus status API
Upstream Zone Metrics
Type: gauge, integer
Description: Current number of live ("up") upstream servers in an upstream group. If
graphed/monitored without specifying an upstream, it's the current
number of all live upstream servers in all upstream groups.
Source: NGINX Plus status API
- plus.upstream.request.count
- plus.upstream.response.count
Type: counter, integer
Description: Number of client requests forwarded to the upstream servers, and responses obtained.
Source: NGINX Plus status API
- plus.upstream.conn.active
Type: gauge, integer
Description: Current number of active connections to the upstream servers.
Source: NGINX Plus status API
- plus.upstream.conn.keepalive
Type: gauge, integer
Description: Current number of idle keepalive connections.
Source: NGINX Plus status API
Type: gauge, integer
Description: Current number of servers removed from the group but still processing
active client requests.
Source: NGINX Plus status API
- plus.upstream.bytes_rcvd
- plus.upstream.bytes_sent
Type: counter, integer
Description: Number of bytes received from the upstream servers, and bytes sent.
Source: NGINX Plus status API
- plus.upstream.status.1xx
- plus.upstream.status.2xx
- plus.upstream.status.3xx
- plus.upstream.status.4xx
- plus.upstream.status.5xx
Type: counter, integer
Description: Number of responses from the upstream servers with status codes 1xx, 2xx,
3xx, 4xx, and 5xx.
Source: NGINX Plus status API
- plus.upstream.header.time
- plus.upstream.header.time.count
- plus.upstream.header.time.max
- plus.upstream.header.time.median
- plus.upstream.header.time.pctl95
Type: gauge, seconds.milliseconds
Description: Average time to get the response header from the upstream servers.
Source: NGINX Plus status API
- plus.upstream.response.time
- plus.upstream.response.time.count
- plus.upstream.response.time.max
- plus.upstream.response.time.median
- plus.upstream.response.time.pctl95
Type: gauge, seconds.milliseconds
Description: Average time to get the full response from the upstream servers.
Source: NGINX Plus status API
- plus.upstream.fails.count
- plus.upstream.unavail.count
Type: counter, integer
Description: Number of unsuccessful attempts to communicate with upstream servers, and
how many times upstream servers became unavailable for client requests.
Source: NGINX Plus status API
- plus.upstream.health.checks
- plus.upstream.health.fails
- plus.upstream.health.unhealthy
Type: counter, integer
Description: Number of performed health check requests, failed health checks, and
how many times the upstream servers became unhealthy.
Source: NGINX Plus status API
Type: gauge, integer
Description: Current number of queued requests.
Source: NGINX Plus status API
- plus.upstream.queue.overflows
Type: counter, integer
Description: Number of requests rejected due to queue overflows.
Source: NGINX Plus status API
Cache Zone Metrics
- plus.cache.bypass
- plus.cache.bypass.bytes
- plus.cache.expired
- plus.cache.expired.bytes
- plus.cache.hit
- plus.cache.hit.bytes
- plus.cache.miss
- plus.cache.miss.bytes
- plus.cache.revalidated
- plus.cache.revalidated.bytes
- plus.cache.size
- plus.cache.stale
- plus.cache.stale.bytes
- plus.cache.updating
- plus.cache.updating.bytes
Type: counter, integer; counter, bytes
Description: Various statistics about NGINX Plus cache usage.
Source: NGINX Plus status API
Stream Zone Metrics
Type: gauge, integer
Description: Number of client connections currently being processed.
Source: NGINX Plus status API
- plus.stream.conn.accepted
Type: counter, integer
Description: Total number of connections accepted from clients.
Source: NGINX Plus status API
- plus.stream.status.2xx
- plus.stream.status.4xx
- plus.stream.status.5xx
Type: counter, integer
Description: Number of sessions completed with status codes 2xx, 4xx, or 5xx.
Source: NGINX Plus status API
Type: counter, integer
Description: Total number of connections completed without creating a session.
Source: NGINX Plus status API
- plus.stream.bytes_rcvd
- plus.stream.bytes_sent
Type: counter, integer
Description: Number of bytes received from clients, and bytes sent.
Source: NGINX Plus status API
- plus.stream.upstream.peers
Type: gauge, integer
Description: Current number of live ("up") upstream servers in an upstream group.
Source: NGINX Plus status API
- plus.stream.upstream.conn.active
Type: gauge, integer
Description: Current number of connections.
Source: NGINX Plus status API
- plus.stream.upstream.conn.count
Type: counter, integer
Description: Total number of client connections forwarded to this server.
Source: NGINX Plus status API
- plus.stream.upstream.conn.time
- plus.stream.upstream.conn.time.count
- plus.stream.upstream.conn.time.max
- plus.stream.upstream.conn.time.median
- plus.stream.upstream.conn.time.pctl95
Type: timer, integer
Description: Average time to connect to an upstream server.
Source: NGINX Plus status API
- plus.stream.upstream.conn.ttfb
Type: timer, integer
Description: Average time to receive the first byte of data.
Source: NGINX Plus status API
- plus.stream.upstream.response.time
Type: timer, integer
Description: Average time to receive the last byte of data.
Source: NGINX Plus status API
- plus.stream.upstream.bytes_sent
- plus.stream.upstream.bytes_rcvd
Type: counter, integer
Description: Number of bytes received from upstream servers, and bytes sent.
Source: NGINX Plus status API
- plus.stream.upstream.fails.count
- plus.stream.upstream.unavail.count
Type: counter, integer
Description: Number of unsuccessful attempts to communicate with upstream servers, and
how many times upstream servers became unavailable for client requests.
Source: NGINX Plus status API
- plus.stream.upstream.health.checks
- plus.stream.upstream.health.fails
- plus.stream.upstream.health.unhealthy
Type: counter, integer
Description: Number of performed health check requests, failed health checks, and
how many times the upstream servers became unhealthy.
Source: NGINX Plus status API
- plus.stream.upstream.zombies
Type: gauge, integer
Description: Current number of servers removed from the group but still
processing active client connections.
Source: NGINX Plus status API
Slab Zone Metrics
Type: gauge, integer
Description: Current number of used memory pages.
Source: NGINX Plus status API
Type: gauge, integer
Description: Current number of free memory pages.
Source: NGINX Plus status API
Type: gauge, integer
Description: Sum of free and used memory pages above.
Type: gauge, percentage
Description: Percentage of free pages.
Other metrics
PHP-FPM metrics
You can also monitor your PHP-FPM applications with NGINX Amplify. The agent should run in the same process environment as PHP-FPM, and be able to find the php-fpm processes with ps(1), otherwise the PHP-FPM metric collection won't work.
When the agent finds a PHP-FPM master process, it tries to auto-detect the path to the PHP-FPM configuration. When the PHP-FPM configuration is found, the agent will look up the pool definitions, and the corresponding pm.status_path directives.
The agent will find all pools and status URIs currently configured. The agent then queries the PHP-FPM pool status(es) via FastCGI. There's no need to define HTTP proxy in your NGINX configuration that will point to the PHP-FPM status URIs.
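As a rough picture of the pool lookup described above, the Python sketch below reads a PHP-FPM configuration as INI text and lists the pools that define pm.status_path. It assumes a single file with no include directives, and the function name pools_with_status and the sample configuration are made up; the agent's own parser may work differently.

```python
import configparser


def pools_with_status(config_text):
    """Return {pool: {"status_path": ..., "listen": ...}} for pools with status enabled."""
    # interpolation=None keeps "%" characters in values literal.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)
    pools = {}
    for section in parser.sections():
        if section == "global":
            continue  # [global] holds process-wide settings, not a pool
        status_path = parser.get(section, "pm.status_path", fallback=None)
        if status_path:
            pools[section] = {
                "status_path": status_path,
                "listen": parser.get(section, "listen", fallback=None),
            }
    return pools


# Hypothetical configuration; the second pool has its status path commented out.
sample_conf = """
[global]
pid = /run/php/php7.0-fpm.pid

[www]
user = www-data
listen = /var/run/php/php7.0-fpm.sock
pm.status_path = /status

[api]
listen = 127.0.0.1:9001
;pm.status_path = /status
"""
pools = pools_with_status(sample_conf)
print(pools)
```

Note that configparser rejects directives that appear before any section header, which is the same kind of input the caveat about ungrouped directives warns about.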
To start monitoring PHP-FPM, follow the steps below:
Make sure that your PHP-FPM status is enabled for at least one pool — if it's not, uncomment the pm.status_path directive for the pool. For PHP7 on Ubuntu, look inside the /etc/php/7.0/fpm/pool.d directory to find the pool configuration files. After you've uncommented the pm.status_path, please make sure to restart PHP-FPM.
# service php7.0-fpm restart
- This step is very important! Check that NGINX, the Amplify Agent, and the PHP-FPM workers all run under the same user ID (e.g. www-data). You may have to change the user ID for the nginx workers, fix the nginx directory permissions, and then restart the agent too. If there are multiple PHP-FPM pools configured with different user IDs, make sure the agent's user ID is included in the group IDs of the PHP-FPM workers. This is required in order for the agent to access the PHP-FPM pool socket when querying for metrics.
Check that the listen socket for the PHP-FPM pool you want to monitor, and for which you enabled pm.status_path, is properly configured with listen.owner and listen.group. Look for the following directives inside the pool configuration file.
listen.owner = www-data
listen.group = www-data
listen.mode = 0660
Check that the PHP-FPM listen socket for the pool exists and has the right permissions.
# ls -la /var/run/php/php7.0-fpm.sock
srw-rw---- 1 www-data www-data 0 May 18 14:02 /var/run/php/php7.0-fpm.sock
Check that you can query the PHP-FPM status for the pool from the command line:
# SCRIPT_NAME=/status SCRIPT_FILENAME=/status QUERY_STRING= REQUEST_METHOD=GET cgi-fcgi -bind -connect /var/run/php/php7.0-fpm.sock
and that the above command returns a valid set of PHP-FPM metrics.
Note. The cgi-fcgi tool has to be installed separately, usually from the libfcgi-dev package. This tool is not required for the agent to collect and report PHP-FPM metrics, however it can be used to quickly diagnose possible issues with PHP-FPM metric collection.
- If your PHP-FPM is configured to use a TCP socket instead of a Unix domain socket, make sure you can query the PHP-FPM metrics manually with cgi-fcgi. Double check that your TCP socket configuration is secure (ideally, PHP-FPM pool listening on 127.0.0.1, and listen.allowed_clients enabled as well).
- Update the agent to the most recent version.
Check that the following options are set in /etc/amplify-agent/agent.conf
[extensions]
phpfpm = True
Restart the agent.
# service amplify-agent restart
The agent should be able to detect the PHP-FPM master and workers, obtain the access to status, and collect the necessary metrics.
With all of the above successfully configured, the end result should be an additional tab displayed on the Graphs page, with the pre-defined visualization of the PHP-FPM metrics. The PHP-FPM metrics on the Graphs page are cumulative, across all automatically detected pools. If you need per-pool graphs, go to Dashboards and create custom graphs per pool.
Here is the list of caveats to look for if the PHP-FPM metrics are not being collected:
- No status enabled for any of the pools.
- Different user IDs used by the agent and the PHP-FPM workers, or lack of a single group (when using PHP-FPM with a Unix domain socket).
- Wrong permissions configured for the PHP-FPM listen socket (when using PHP-FPM with a Unix domain socket).
- Agent can't connect to the TCP socket (when using PHP-FPM with a TCP socket).
- Agent can't parse the PHP-FPM configuration. A possible workaround is to not have any ungrouped directives. Try to move any ungrouped directives under [global] and pool section headers.
If checking the above issues didn't help, please enable the agent's debug log, restart the agent, wait a few minutes, and then create an issue in the nginx-amplify-agent repo along with the relevant debug log.
Below is the list of supported PHP-FPM metrics.
Type: counter, integer
Description: The number of requests accepted by the pool.
Source: PHP-FPM status (accepted conn)
Type: gauge, integer
Description: The number of requests in the queue of pending connections.
Source: PHP-FPM status (listen queue)
Type: gauge, integer
Description: The maximum number of requests in the queue of pending connections since FPM has started.
Source: PHP-FPM status (max listen queue)
Type: gauge, integer
Description: The size of the socket queue of pending connections.
Source: PHP-FPM status (listen queue len)
Type: gauge, integer
Description: The number of idle processes.
Source: PHP-FPM status (idle processes)
Type: gauge, integer
Description: The number of active processes.
Source: PHP-FPM status (active processes)
Type: gauge, integer
Description: The number of idle + active processes.
Source: PHP-FPM status (total processes)
Type: gauge, integer
Description: The maximum number of active processes since FPM has started.
Source: PHP-FPM status (max active processes)
Type: gauge, integer
Description: The number of times the process limit has been reached.
Source: PHP-FPM status (max children reached)
Type: counter, integer
Description: The number of requests that exceeded request_slowlog_timeout value.
Source: PHP-FPM status (slow requests)
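The Source fields above name keys from the plain-text PHP-FPM status output. The Python sketch below turns such output into a dictionary so the keys can be looked up directly. The function name parse_fpm_status and the sample values are illustrative only, and the sketch does not perform the FastCGI query itself.

```python
def parse_fpm_status(text):
    """Parse "key: value" lines of PHP-FPM status output into a dict."""
    status = {}
    for line in text.splitlines():
        # Split on the first colon only, since values such as timestamps contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        status[key.strip()] = int(value) if value.isdigit() else value
    return status


# Hypothetical status output with made-up values.
sample_status = """pool:                 www
process manager:      dynamic
start time:           18/May/2017:14:02:11 +0000
accepted conn:        12073
listen queue:         0
max listen queue:     1
listen queue len:     128
idle processes:       35
active processes:     65
total processes:      100
max active processes: 65
max children reached: 0
slow requests:        0
"""
fpm = parse_fpm_status(sample_status)
print(fpm["accepted conn"], fpm["idle processes"] + fpm["active processes"])
```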
MySQL metrics
Versions 1.1.0 and above of the Amplify agent include a plugin for monitoring MySQL databases. Again, the agent should run in the same process environment as MySQL, and be able to find the mysqld processes with ps(1), otherwise the MySQL metric collection won't work.
The agent doesn't try to find and parse any existing MySQL configuration files. In order for the agent to connect to MySQL and collect the metrics, a few simple configuration steps should be performed.
To start monitoring MySQL, follow the instructions below.
Create a new user for the Amplify agent.
$ mysql -u root -p
[..]
mysql> CREATE USER 'amplify-agent'@'localhost' IDENTIFIED BY 'xxxxxx';
Query OK, 0 rows affected (0.01 sec)
Check that the user can read MySQL metrics.
$ mysql -u amplify-agent -p
..
mysql> show global status;
+-----------------------------------------------+--------------------------------------------------+
| Variable_name | Value |
+-----------------------------------------------+--------------------------------------------------+
| Aborted_clients | 0 |
..
| Uptime_since_flush_status | 1993 |
+-----------------------------------------------+--------------------------------------------------+
353 rows in set (0.01 sec)
Note. The agent doesn't use mysql(1) for metric collection, however it implements a similar query mechanism via a Python module.
- Update the agent to the most recent version.
Add the following to /etc/amplify-agent/agent.conf
[extensions]
..
mysql = True
[mysql]
#host =
#port =
unix_socket = /var/run/mysqld/mysqld.sock
user = amplify-agent
password = xxxxxx
where the password option matches the password set when creating the user above.
Restart the agent.
# service amplify-agent restart
With the above configuration steps the agent should be able to detect the MySQL master, obtain the access to status, and collect the necessary metrics. The end result should be an additional tab displayed on the Graphs page, with the pre-defined visualization of the key MySQL metrics.
If the MySQL metrics are not visible, check that unix_socket under [mysql] in agent.conf corresponds to the path of the mysql.sock file.
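When troubleshooting, it can help to read back what agent.conf actually contains. The Python sketch below extracts the MySQL-related settings from configuration text and checks whether the configured socket path exists. The function names mysql_settings and socket_present and the sample text are assumptions for this example, not part of the agent.

```python
import configparser
import os


def mysql_settings(conf_text):
    """Return the MySQL-related settings found in agent.conf-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return {
        "enabled": parser.getboolean("extensions", "mysql", fallback=False),
        "unix_socket": parser.get("mysql", "unix_socket", fallback=None),
        "user": parser.get("mysql", "user", fallback=None),
    }


def socket_present(settings):
    """True if a socket path is configured and exists on this machine."""
    path = settings["unix_socket"]
    return bool(path) and os.path.exists(path)


# Sample text modelled on the configuration shown above.
sample_agent_conf = """
[extensions]
mysql = True

[mysql]
#host =
#port =
unix_socket = /var/run/mysqld/mysqld.sock
user = amplify-agent
password = xxxxxx
"""
settings = mysql_settings(sample_agent_conf)
print(settings, socket_present(settings))
```

If socket_present returns False on the monitored host, the unix_socket value most likely needs to be changed to the path the MySQL server actually uses.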
If the above didn't work, please enable the agent's debug log, restart the agent, wait a few minutes, and then create an issue in the nginx-amplify-agent repo along with the relevant debug log.
Below is the list of supported MySQL metrics.
Type: counter, integer
Description: The number of connection attempts (successful or not) to the MySQL server.
Source: SHOW GLOBAL STATUS LIKE "Connections";
Type: counter, integer
Description: The number of statements executed by the server. See MySQL reference manual for details.
Source: SHOW GLOBAL STATUS LIKE "Questions";
Type: counter, integer
Description: The number of times a select statement has been executed.
Source: SHOW GLOBAL STATUS LIKE "Com_select";
Type: counter, integer
Description: The number of times an insert statement has been executed.
Source: SHOW GLOBAL STATUS LIKE "Com_insert";
Type: counter, integer
Description: The number of times an update statement has been executed.
Source: SHOW GLOBAL STATUS LIKE "Com_update";
Type: counter, integer
Description: The number of times a delete statement has been executed.
Source: SHOW GLOBAL STATUS LIKE "Com_delete";
Type: counter, integer
Description: Sum of insert, update, and delete counters above.
Type: counter, integer
Description: The number of times a commit statement has been executed.
Source: SHOW GLOBAL STATUS LIKE "Com_commit";
- mysql.global.slow_queries
Type: counter, integer
Description: The number of queries that have taken more than long_query_time seconds.
Source: SHOW GLOBAL STATUS LIKE "Slow_queries";
Type: counter, integer
Description: The number of seconds that the server has been up.
Source: SHOW GLOBAL STATUS LIKE "Uptime";
- mysql.global.aborted_connects
Type: counter, integer
Description: The number of failed attempts to connect to the MySQL server.
Source: SHOW GLOBAL STATUS LIKE "Aborted_connects";
- mysql.global.innodb_buffer_pool_read_requests
Type: counter, integer
Description: The number of logical read requests.
Source: SHOW GLOBAL STATUS LIKE "Innodb_buffer_pool_read_requests";
- mysql.global.innodb_buffer_pool_reads
Type: counter, integer
Description: The number of logical reads that InnoDB could not satisfy from the buffer
pool, and had to read directly from disk.
Source: SHOW GLOBAL STATUS LIKE "Innodb_buffer_pool_reads";
- mysql.global.innodb_buffer_pool.hit_ratio
Type: gauge, percentage
Description: Hit ratio reflecting the efficiency of the InnoDB buffer pool.
- mysql.global.innodb_buffer_pool_pages_total
Type: gauge, integer
Description: The total size of the InnoDB buffer pool, in pages.
Source: SHOW GLOBAL STATUS LIKE "Innodb_buffer_pool_pages_total";
- mysql.global.innodb_buffer_pool_pages_free
Type: gauge, integer
Description: The number of free pages in the InnoDB buffer pool.
Source: SHOW GLOBAL STATUS LIKE "Innodb_buffer_pool_pages_free";
- mysql.global.innodb_buffer_pool_util
Type: gauge, percentage
Description: InnoDB buffer pool utilization.
- mysql.global.threads_connected
Type: gauge, integer
Description: The number of currently open connections.
Source: SHOW GLOBAL STATUS LIKE "Threads_connected";
- mysql.global.threads_running
Type: gauge, integer
Description: The number of threads that are not sleeping.
Source: SHOW GLOBAL STATUS LIKE "Threads_running";
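Several metrics above are derived rather than read directly from SHOW GLOBAL STATUS. The Python sketch below shows one plausible way to compute them. The hit ratio and utilization formulas are common conventions assumed for this example; the list above does not state the exact formulas the agent uses. The function name, the result keys, and the sample values are likewise made up.

```python
def derived_metrics(status):
    """Compute derived values from a dict of SHOW GLOBAL STATUS counters."""
    # Sum of the insert, update, and delete counters, as described above.
    writes = status["Com_insert"] + status["Com_update"] + status["Com_delete"]

    # Assumed formula: share of logical read requests served without a disk read.
    requests = status["Innodb_buffer_pool_read_requests"]
    disk_reads = status["Innodb_buffer_pool_reads"]
    hit_ratio = 100.0 * (requests - disk_reads) / requests if requests else 0.0

    # Assumed formula: share of buffer pool pages that are not free.
    total = status["Innodb_buffer_pool_pages_total"]
    free = status["Innodb_buffer_pool_pages_free"]
    util = 100.0 * (total - free) / total if total else 0.0

    return {"writes": writes, "hit_ratio": hit_ratio, "buffer_pool_util": util}


# Hypothetical counter values, for illustration only.
status = {
    "Com_insert": 10,
    "Com_update": 20,
    "Com_delete": 5,
    "Innodb_buffer_pool_read_requests": 1000,
    "Innodb_buffer_pool_reads": 50,
    "Innodb_buffer_pool_pages_total": 8192,
    "Innodb_buffer_pool_pages_free": 2048,
}
derived = derived_metrics(status)
print(derived)
```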