How NGINX Amplify Agent Works

NGINX Amplify Agent is a compact application written in Python. Its role is to collect various metrics and metadata and send them securely to the backend for storage and visualization.
You need to install the Amplify Agent on every host that you want to monitor.
After a successful installation, the agent automatically starts reporting metrics, and you should see real-time metrics data in the NGINX Amplify web interface within about 60 seconds.
NGINX Amplify can currently monitor and collect performance metrics for:
  1. Operating system (see the list of supported OS here)
  2. NGINX and NGINX Plus
  3. PHP-FPM
  4. MySQL
The agent considers an NGINX instance to be any running NGINX master process that has a unique path to the binary, and possibly a unique configuration.
Note. There's no need to manually add or configure anything in the web interface after installing the agent. When the agent is started, the metrics and the metadata are automatically reported to the Amplify backend, and visualized in the web interface.
When a system or an NGINX instance is removed from the infrastructure for whatever reason, and is no longer reporting (and therefore no longer necessary), you should manually delete it in the web interface. The "Remove object" button can be found in the metadata viewer popup — see User Interface below.

Metadata and Metrics Collection

NGINX Amplify Agent collects the following types of data:
  • NGINX metrics. The agent collects many NGINX-related metrics from stub_status, the NGINX Plus status API, the NGINX log files, and the NGINX process state.
  • System metrics. These are various key metrics describing the system, e.g. CPU usage, memory usage, network traffic, etc.
  • PHP-FPM metrics. The agent can obtain metrics from the PHP-FPM pool status, if it detects a running PHP-FPM master process.
  • MySQL metrics. The agent can obtain metrics from the MySQL global status set of variables.
  • NGINX metadata. This is what describes your NGINX instances, and it includes package data, build information, the path to the binary, build configuration options, etc. NGINX metadata also includes the NGINX configuration elements.
  • System metadata. This is the basic information about the OS environment where the agent runs. This could be the hostname, uptime, OS flavor, and other data.
The agent mostly uses the Python psutil library to collect the metrics, but occasionally it may also invoke certain system utilities like ps(1).
While the agent is running on the host, it collects metrics at regular 20-second intervals. Metrics are then downsampled and sent to the Amplify backend once a minute.
Metadata is also reported every minute. Changes in the metadata can be examined through the Amplify web interface.
NGINX config updates are reported only when a configuration change is detected.
If the agent is not able to reach the Amplify backend to send the accumulated metrics, it will continue to collect metrics, and will send them to Amplify as soon as connectivity is re-established. The agent can buffer at most about 2 hours' worth of data.
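The collection cycle described above can be sketched roughly as follows. This is only an illustration, not the agent's actual code: the ReportBuffer class, the use of a plain average for downsampling, and the choice to drop the oldest reports when the buffer is full are all assumptions made for the example. Only the 20-second, 1-minute, and 2-hour figures come from the text.

```python
from collections import deque
from statistics import mean

SAMPLE_INTERVAL = 20        # seconds between collections (from the text)
REPORT_INTERVAL = 60        # seconds between reports (from the text)
MAX_BUFFERED_REPORTS = 120  # about 2 hours of 1-minute reports


class ReportBuffer:
    """Hypothetical buffer that holds per-minute reports until they are sent."""

    def __init__(self, send):
        self.send = send    # callable returning True when the backend accepted a report
        self.samples = []   # raw samples collected during the current minute
        # Assumption: when the buffer is full, the oldest report is discarded.
        self.pending = deque(maxlen=MAX_BUFFERED_REPORTS)

    def add_sample(self, value):
        self.samples.append(value)
        if len(self.samples) == REPORT_INTERVAL // SAMPLE_INTERVAL:
            # Assumption: downsampling is a simple average of the minute's samples.
            self.pending.append(mean(self.samples))
            self.samples = []
            self.flush()

    def flush(self):
        # Send the oldest report first; stop at the first failure and keep the rest.
        while self.pending:
            if not self.send(self.pending[0]):
                break
            self.pending.popleft()
```

With a send callable that returns False while the backend is unreachable, reports simply accumulate in pending and are delivered in order once send starts succeeding again.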

Detecting and Monitoring NGINX Instances

NGINX Amplify Agent is capable of detecting several types of NGINX instances:
  • Installed from a repository package
  • Built and installed manually
A separate instance of NGINX as seen by the agent would be the following:
  • A unique master process and its workers, started with an absolute path to a distinct NGINX binary
  • A master process running with a default config path, or with a custom path set in the command-line parameters
Note. The agent will try to detect and monitor all unique NGINX instances currently running on a host. Separate sets of metrics and metadata are collected for each unique NGINX instance.
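As a rough illustration of the detection rules above, the sketch below scans ps-style output for NGINX master processes and records the binary path and any config path given with -c. The function name, the input line layout ("pid ppid command"), and the returned dictionary keys are assumptions for this example; the agent's real detection logic may differ.

```python
import shlex


def find_nginx_masters(ps_lines):
    """Return one dict per NGINX master process found in ps-style lines.

    Each line is assumed to look like "<pid> <ppid> <command...>".
    """
    marker = "nginx: master process"
    instances = []
    for line in ps_lines:
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[2].startswith(marker):
            continue  # not a master process (e.g. a worker)
        argv = shlex.split(parts[2][len(marker):])
        if not argv or not argv[0].startswith("/"):
            continue  # started with a relative path such as "./nginx"
        conf_path = None  # None means the default config path is in use
        if "-c" in argv and argv.index("-c") + 1 < len(argv):
            conf_path = argv[argv.index("-c") + 1]
        instances.append(
            {"pid": int(parts[0]), "bin_path": argv[0], "conf_path": conf_path}
        )
    return instances
```

Two masters that differ in bin_path or conf_path would then be treated as separate instances, each with its own metrics and metadata.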

Configuring NGINX for Metric Collection

In order to monitor an NGINX instance, the agent should be able to find the relevant NGINX master process first, and determine its key characteristics.

Metrics from stub_status

You need to define stub_status in your NGINX configuration for key NGINX graphs to appear in the web interface. If stub_status is already enabled, the agent should be able to locate it automatically.
If you're using NGINX Plus, then you need to configure either the stub_status module, or the NGINX Plus API module.
Without stub_status or the NGINX Plus status API, the agent will NOT be able to collect key NGINX metrics required for further monitoring and analysis.
Add the stub_status configuration as follows. You may also grab this config snippet here:
    # cd /etc/nginx

    # grep -i include\.*conf nginx.conf
        include /etc/nginx/conf.d/*.conf;

    # cat > conf.d/stub_status.conf
    server {
        listen 127.0.0.1:80;
        server_name 127.0.0.1;
        location /nginx_status {
            stub_status on;
            allow 127.0.0.1;
            deny all;
        }
    }
    <Ctrl-D>

    # ls -la conf.d/stub_status.conf
    -rw-r--r-- 1 root root 162 Nov 4 02:40 conf.d/stub_status.conf

    # nginx -t
    nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
    nginx: configuration file /etc/nginx/nginx.conf test is successful

    # kill -HUP `cat /var/run/nginx.pid`
Don't forget to test your NGINX configuration after you've added the stub_status section above. Make sure there's no ambiguity with either the listen or the server_name configuration. The agent should be able to clearly identify the stub_status URL and will default to 127.0.0.1 if the configuration is incomplete.
Note. If you use the conf.d directory to keep common parts of your NGINX configuration that are then automatically included in the server sections across your NGINX config, do not use the snippet above. Instead, you should configure stub_status manually within an appropriate location or server block.
Note. There's no need to use exactly the above example nginx_status URI for stub_status. The agent will determine the correct URI automatically upon parsing your NGINX configuration. Please make sure that the directory and the actual configuration file with stub_status are readable by the agent, otherwise the agent won't be able to correctly determine the stub_status URL. If the agent fails to find stub_status, please refer to the workaround described here.
Please make sure the stub_status ACL is correctly configured, especially if your system is IPv6-enabled. Test the reachability of stub_status metrics with wget(1) or curl(1). When testing, use the exact URL matching your NGINX configuration.
For more information about stub_status, please refer to the NGINX documentation here.
If everything is configured properly, you should see something along these lines when testing it with curl(1):
    $ curl http://127.0.0.1/nginx_status
    Active connections: 2
    server accepts handled requests
     344014 344014 661581
    Reading: 0 Writing: 1 Waiting: 1
If the above doesn't work, make sure to check where the requests to /nginx_status are being routed. In many cases other server blocks can be the reason you can't access stub_status.
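For reference, a stub_status response like the one shown above is easy to turn into numbers. The following is a minimal sketch; the function name and the lowercase dictionary keys are illustrative choices, not something defined by the agent.

```python
import re

# Matches the four-line plain-text body returned by the stub_status module.
STUB_STATUS_RE = re.compile(
    r"Active connections:\s*(?P<active>\d+)\s+"
    r"server accepts handled requests\s+"
    r"(?P<accepts>\d+)\s+(?P<handled>\d+)\s+(?P<requests>\d+)\s+"
    r"Reading:\s*(?P<reading>\d+)\s+"
    r"Writing:\s*(?P<writing>\d+)\s+"
    r"Waiting:\s*(?P<waiting>\d+)"
)


def parse_stub_status(body):
    """Parse a stub_status response body into a dict of integers."""
    match = STUB_STATUS_RE.search(body)
    if match is None:
        raise ValueError("response does not look like stub_status output")
    return {name: int(value) for name, value in match.groupdict().items()}
```

If the URL is answered by a different server block, the body will not match and the function raises ValueError, which is a quick way to notice a routing problem.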
The agent uses data from stub_status to calculate metrics related to server-wide HTTP connections and requests as described below:
    nginx.http.conn.accepted = stub_status.accepts
    nginx.http.conn.active = stub_status.active - stub_status.waiting
    nginx.http.conn.current = stub_status.active
    nginx.http.conn.dropped = stub_status.accepts - stub_status.handled
    nginx.http.conn.idle = stub_status.waiting
    nginx.http.request.count = stub_status.requests
    nginx.http.request.current = stub_status.reading + stub_status.writing
    nginx.http.request.reading = stub_status.reading
    nginx.http.request.writing = stub_status.writing
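The same mapping can be written as a small function. This sketch assumes the stub_status values are already available as a dictionary of integers keyed by the lowercase field names (accepts, handled, requests, active, reading, writing, waiting); that input shape is an assumption for the example. It only applies the formulas listed above and says nothing about how the agent aggregates counters over time.

```python
def amplify_http_metrics(stub):
    """Derive the server-wide HTTP metrics from stub_status values."""
    return {
        "nginx.http.conn.accepted": stub["accepts"],
        "nginx.http.conn.active": stub["active"] - stub["waiting"],
        "nginx.http.conn.current": stub["active"],
        "nginx.http.conn.dropped": stub["accepts"] - stub["handled"],
        "nginx.http.conn.idle": stub["waiting"],
        "nginx.http.request.count": stub["requests"],
        "nginx.http.request.current": stub["reading"] + stub["writing"],
        "nginx.http.request.reading": stub["reading"],
        "nginx.http.request.writing": stub["writing"],
    }
```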
For NGINX Plus the agent will automatically use similar metrics available from the status API.
For more information about the metric list, please refer to Metrics and Metadata.

Metrics from access.log and error.log

NGINX Amplify Agent will also collect more NGINX metrics from the access.log and the error.log files. In order to do that, the agent should be able to read the logs. Make sure that either the nginx user or the user defined in the NGINX config (such as www-data) can read the log files. Please also make sure that the log files are being written normally.
You don't have to specifically point the agent to either the NGINX configuration or the NGINX log files — it should detect their location automatically.
The agent will also try to detect the log format for a particular log, in order to be able to parse it properly and possibly extract even more useful metrics, e.g. $upstream_response_time.
Note. A number of metrics outlined in Metrics and Metadata will only be available if the corresponding variables are included in a custom access.log format used for logging requests. You can find a complete list of NGINX log variables here.
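To show why the log format matters, here is a minimal sketch of how a log_format string could be turned into a parser that exposes each variable by name. It is not the agent's parser: the function name is hypothetical, the format string in the usage note is just an example, and the sketch assumes each variable appears only once and is written as $name rather than ${name}.

```python
import re


def log_format_to_regex(log_format):
    """Compile an NGINX log_format string into a regex with one group per variable."""
    pattern = ""
    # re.split with a capture group alternates literal text and variable names.
    for index, piece in enumerate(re.split(r"\$(\w+)", log_format)):
        if index % 2 == 0:
            pattern += re.escape(piece)        # literal text between variables
        else:
            pattern += "(?P<%s>.*?)" % piece   # the variable's value
    return re.compile("^" + pattern + "$")
```

For example, with the format '$remote_addr [$time_local] "$request" $status rt=$request_time urt="$upstream_response_time"', calling match(line).group("upstream_response_time") on a matching log line returns the logged upstream response time. If a variable is absent from the format, there is nothing to extract, and the corresponding metric cannot be produced.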

Using Syslog for Metric Collection

If you configured the agent for syslog metric collection (see below), make sure to add the following settings to the NGINX configuration:
  1. Check that you are using NGINX version 1.9.5 or newer (or NGINX Plus Release 8 or newer).
  2. Edit the NGINX configuration file and specify the syslog listener address as the first parameter to the access_log directive. Include the amplify tag and your preferred log format:
    access_log syslog:server=127.0.0.1:12000,tag=amplify,severity=info main_ext;
    (see also how to extend the NGINX log format to collect additional metrics)
  3. Reload NGINX:
    # service nginx reload
    (see more here)
Note. To send the NGINX logs to both the existing logging facility and the Amplify Agent, include a separate access_log directive for each destination.
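On the receiving side, each log line arrives as a syslog datagram carrying the amplify tag. The sketch below shows one way to pull the original access log line out of such a datagram. It assumes an RFC 3164-style message of the form "<PRI>timestamp hostname tag: message"; the function name and the returned keys are hypothetical, and the sample datagram in the usage note is made up for illustration.

```python
def extract_tagged_message(datagram, tag="amplify"):
    """Return facility, severity and message for a tagged syslog datagram, or None."""
    text = datagram.decode("utf-8", errors="replace")
    if not text.startswith("<"):
        return None
    pri, _, rest = text[1:].partition(">")
    _header, separator, message = rest.partition(" %s: " % tag)
    if not pri.isdigit() or not separator:
        return None  # malformed priority, or a message with a different tag
    # Per RFC 3164, PRI = facility * 8 + severity.
    return {
        "facility": int(pri) // 8,
        "severity": int(pri) % 8,
        "message": message,
    }
```

For instance, b'<190>Nov  4 02:40:00 web1 amplify: 127.0.0.1 "GET / HTTP/1.1" 200' yields severity 6 (info) and the access log line as the message, which can then be parsed according to the configured log format.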

What to Check if the Agent Isn't Reporting Metrics

After you install and start the agent, it should normally start reporting right away, pushing aggregated data to the Amplify backend at regular 1-minute intervals. It takes about a minute for a new system to appear in the Amplify web interface.
If you don't see the new system or NGINX in the web interface, or (some) metrics aren't being collected, please check the following:
  1. The Amplify Agent package has been successfully installed, and no warnings were seen upon the installation.
  2. The amplify-agent process is running and updating its log file.
  3. The agent is running under the same user as your NGINX worker processes.
  4. NGINX is started with an absolute path. The agent can't detect NGINX instances launched with a relative path (e.g. "./nginx").
  5. The user ID used by the agent and NGINX can run ps(1) to see all system processes. If ps(1) is restricted for non-privileged users, the agent won't be able to find and properly detect the NGINX master process.
  6. The system time is set correctly. If the clock on the system where the agent runs is ahead of or behind the actual time, you won't be able to see the graphs.
  7. stub_status is properly configured, and the stub_status module is included in the NGINX build (this can be checked with nginx -V).
  8. NGINX access.log and error.log files are readable by the user nginx (or by the user set in NGINX config).
  9. All NGINX configuration files are readable by the agent user ID (check owner, group and permissions).
  10. Extra configuration steps have been performed as required for the additional metrics to be collected.
  11. The system DNS resolver is correctly configured, and receiver.amplify.nginx.com can be successfully resolved.
  12. Outbound TLS/SSL from the system to receiver.amplify.nginx.com is not restricted. This can be checked with curl(1). Configure a proxy server for the agent if required.
  13. selinux(8), apparmor(7) or grsecurity are not interfering with the metric collection. E.g. for selinux(8) check /etc/selinux/config, try setenforce 0 temporarily and see if it improves the situation for certain metrics.
  14. Some VPS providers use hardened Linux kernels that may restrict non-root users from accessing /proc and /sys. Metrics describing system and NGINX disk I/O are usually affected. There is no easy workaround for this except for allowing the agent to run as root. Sometimes fixing permissions for /proc and /sys/block may work.
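A couple of the checks above (file readability and DNS resolution) are easy to script. The helper names below are hypothetical and this is only a sketch of a manual diagnostic, not part of the agent. To be meaningful, it should be run as the same user the agent runs under.

```python
import os
import socket


def unreadable_paths(paths):
    """Return the paths that the current user cannot read (or that don't exist)."""
    return [path for path in paths if not os.access(path, os.R_OK)]


def can_resolve(hostname):
    """Return True if the system resolver can resolve the hostname."""
    try:
        socket.getaddrinfo(hostname, 443)
        return True
    except socket.gaierror:
        return False
```

For example, unreadable_paths(["/etc/nginx/nginx.conf", "/var/log/nginx/access.log"]) should return an empty list, and can_resolve("receiver.amplify.nginx.com") should return True on a correctly configured system. The two file paths here are common defaults used for illustration; substitute the paths from your own NGINX configuration.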

NGINX Configuration Analysis

NGINX Amplify Agent is able to automatically find all relevant NGINX configuration files, parse them, extract their logical structure, and send the associated JSON data to the Amplify backend for further analysis and reporting. For more information on configuration analysis, please see the Analyzer section below.
After the agent finds a particular NGINX configuration, it automatically starts to keep track of its changes. When a change is detected (e.g. the master process restarts, or the NGINX config is edited), an update is sent to the Amplify backend.
Note. The agent doesn't ever send the raw unprocessed config files to the backend system. In addition, the following directives in the NGINX configuration are never analyzed — and their parameters aren't exported to the SaaS backend: ssl_certificate_key, ssl_client_certificate, ssl_password_file, ssl_stapling_file, ssl_trusted_certificate, auth_basic_user_file, secure_link_secret.
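The effect of this exclusion list can be illustrated with a short sketch. The parsed-config structure used here (a list of dictionaries with "directive", "args", and an optional nested "block") is an assumption for the example and not the agent's actual JSON schema; likewise, blanking the parameters while keeping the directive name is just one possible reading of the behavior described above.

```python
# Directives whose parameters are never exported (list taken from the note above).
SENSITIVE_DIRECTIVES = frozenset({
    "ssl_certificate_key",
    "ssl_client_certificate",
    "ssl_password_file",
    "ssl_stapling_file",
    "ssl_trusted_certificate",
    "auth_basic_user_file",
    "secure_link_secret",
})


def redact_sensitive(directives):
    """Return a copy of a parsed config tree with sensitive parameters removed."""
    cleaned = []
    for entry in directives:
        item = dict(entry)  # shallow copy so the input tree is left untouched
        if entry["directive"] in SENSITIVE_DIRECTIVES:
            item["args"] = []
        if "block" in entry:
            item["block"] = redact_sensitive(entry["block"])
        cleaned.append(item)
    return cleaned
```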

Source Code for NGINX Amplify Agent

NGINX Amplify Agent is an open source application. It is licensed under the 2-clause BSD license, and is available here:
  • Sources: https://github.com/nginxinc/nginx-amplify-agent
  • Public package repository: http://packages.amplify.nginx.com
  • Install script for Linux: https://github.com/nginxinc/nginx-amplify-agent/raw/master/packages/install.sh
  • A script to install the agent when the package is not available: https://raw.githubusercontent.com/nginxinc/nginx-amplify-agent/master/packages/install-source.sh