System Monitoring
Overview
There are several common reasons for system monitoring:
- Performance optimization: monitoring CPU and memory usage helps identify and address performance issues. For example, abnormally high CPU usage may signal the need for a hardware upgrade or a software optimization.
- Resource management: analyze system demand to plan capacity upgrades and prepare for performance spikes.
- System maintenance: get actionable data in time to ensure the system meets user requirements.
- Security: monitoring network metrics can help identify and eliminate potential security threats.
Aggregator can be configured to expose application and standard JVM metrics for system monitoring - refer to Description of Metrics to learn more. You can use Prometheus to collect and store the metrics, and Grafana to visualize them.
Step 1: Enable Metrics
To enable application and JVM metrics, configure Aggregator via Java System Properties or the Aggregator admin.properties configuration file. Once configured, metrics are available via a dedicated endpoint /agg/api/metrics (e.g., http://localhost:port/agg/api/metrics) in a format compatible with Prometheus.
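The endpoint serves metrics in the Prometheus text exposition format (one `name{labels} value` line per sample, with `# HELP`/`# TYPE` comment lines). As a rough illustration of what a scrape returns, here is a minimal Python sketch that parses such a payload; the sample text and its values are illustrative, not real Aggregator output:

```python
# Minimal sketch: parse the Prometheus text exposition format returned by
# an endpoint such as /agg/api/metrics. Labels are stripped and repeated
# metric names overwrite earlier ones - enough for a quick inspection,
# not a full parser.

def parse_prometheus_text(payload: str) -> dict:
    """Return a {metric_name: value} dict, ignoring HELP/TYPE comments and labels."""
    metrics = {}
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        name_part, _, value_part = line.rpartition(" ")
        # Drop any label set, e.g. jvm_memory_used_bytes{area="heap"}
        name = name_part.split("{", 1)[0]
        metrics[name] = float(value_part)
    return metrics

# Illustrative payload in the exposition format
sample = """\
# HELP aggregator_processes_active The number of successfully running connectors.
# TYPE aggregator_processes_active gauge
aggregator_processes_active 12
aggregator_processes_failed 0
jvm_memory_used_bytes{area="heap"} 268435456
"""

if __name__ == "__main__":
    parsed = parse_prometheus_text(sample)
    print(parsed["aggregator_processes_active"])  # 12.0
```

In practice Prometheus does this parsing for you; the sketch is only useful for ad-hoc checks against the endpoint.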
Java System Properties
To enable system metrics, add these variables to your Docker or docker-compose configuration:
- Version 5.6
- Version 5.5
# docker-compose.yml example
environment:
  - JAVA_OPTS=
    -DAggregator.metrics.enable=true # enable metrics for Aggregator
    -DAggregator.metrics.enableJvmMetrics=true # enable JVM metrics (disabled by default)
    -DAggregator.metrics.enableTomcatMetrics=true # enable Tomcat metrics (disabled by default)
    -DAggregator.metrics.anonymousAccess=true # enable anonymous access to metrics endpoints (disabled by default)
admin.properties
To enable system metrics, specify the following parameters in the admin.properties configuration file:
Aggregator.metrics.enable=true # enable metrics for Aggregator
Aggregator.metrics.enableJvmMetrics=true # enable JVM metrics (disabled by default)
Aggregator.metrics.enableTomcatMetrics=true # enable Tomcat metrics (disabled by default)
Aggregator.metrics.anonymousAccess=true # enable anonymous access to metrics endpoints (disabled by default)
info
By default, metrics endpoints follow the configured security policy.
Note that for remote access to metrics endpoints, you must also set QuantServer.enableRemoteAccess=true.
Refer to admin properties, User Access Control and Aggregator config to learn more.
# docker-compose.yml example
environment:
  - JAVA_OPTS=
    -DAggregator.metrics.enable=true # enable both application and JVM metrics
admin.properties
To enable system metrics, specify the following parameters in the admin.properties configuration file:
Aggregator.enableMetrics=true # enable both application and JVM metrics
Kubernetes Configuration
To enable all types of metrics in Kubernetes:
Enable serviceMonitor in the Aggregator values.yaml. Note that this service is enabled by default.
# include this in Aggregator values.yaml to enable system metrics monitoring
serviceMonitor:
  enabled: true
  namespace: monitoring
  interval: "30s"
  labels:
    monitoring: application
info
Refer to Deployment in Kubernetes to learn more.
Step 2: Get Metrics
You can use Prometheus to collect and store system metrics. To scrape metrics from Aggregator periodically, set metrics_path to the Aggregator endpoint /agg/api/metrics in the Prometheus configuration file:
# Example of configuration for Prometheus
global:
  scrape_interval: 1s # How frequently to scrape targets by default (Prometheus' own default is 15s).
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'external-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'Aggregator'
    metrics_path: /agg/api/metrics
    # Override the global default scrape interval for this job.
    scrape_interval: 1s
    static_configs:
      - targets: ['localhost:8011']
info
To learn how to install and configure Prometheus, refer to Prometheus Installation and Prometheus Getting Started.
Step 3: Use Metrics
You can visualize the collected system metrics in Grafana dashboards:
1. Install and configure the Prometheus data source.
2. Add and configure Grafana dashboards.
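Both steps can also be automated with Grafana's provisioning mechanism. Below is a minimal sketch of a data source provisioning file; the name and URL are assumptions (the URL presumes Prometheus runs locally on its default port):

```yaml
# Sketch of a Grafana data source provisioning file (e.g., provisioning/datasources/prometheus.yaml)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090 # assumed Prometheus address
    isDefault: true
```

With this file in place, Grafana registers the data source on startup and dashboards can reference it directly.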
This is a sample illustration of the JVM Actuator Grafana dashboard with Aggregator JVM metrics:
Disable JVM Metrics
There are two ways to disable JVM metrics, depending on the method you used to enable them:
Java System Properties
To disable JVM metrics, add these variables to your Docker or docker-compose file:
# docker-compose.yml example
environment:
  - JAVA_OPTS=
    -DAggregator.metrics.disableJvmMetrics=true # disable JVM metrics
admin.properties
To disable JVM metrics, specify the following parameters in the admin.properties configuration file:
Aggregator.metricsService=QSMetricsServiceInfo
Aggregator.metricsService.disableJvmMetrics=true # disable JVM metrics
Description of Metrics
Aggregator Metrics
| Aggregator Application Metric | Description |
|---|---|
| aggregator_processes_deployed | The number of configured data connectors. |
| aggregator_processes_active | The number of successfully running connectors. |
| aggregator_processes_failed | The number of failed connectors. |
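These connector metrics lend themselves to alerting. As an illustration, here is a sketch of a Prometheus alerting rule built on aggregator_processes_failed; the rule file syntax is standard Prometheus, but the alert name, threshold, duration, and labels are illustrative assumptions:

```yaml
# Sketch of a Prometheus alerting rule file (illustrative values)
groups:
  - name: aggregator-alerts
    rules:
      - alert: AggregatorConnectorsFailed
        expr: aggregator_processes_failed > 0
        for: 5m # fire only if the condition holds for 5 minutes
        labels:
          severity: warning
        annotations:
          summary: "One or more Aggregator connectors have failed"
```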
Available JVM Metrics
| JVM Metric | Description |
|---|---|
| jvm_memory_committed_bytes | The amount of memory (in bytes) committed for the Java virtual machine to use. |
| jvm_memory_used_bytes | The amount of used memory (in bytes) in the Java virtual machine runtime. |
| jvm_memory_max_bytes | The maximum amount of memory (in bytes) available to the Java virtual machine runtime. |
| process_cpu_utilization | The CPU Usage (in percent) of the Java virtual machine. |
| process_uptime_seconds | The amount of time (in seconds) the Java virtual machine has been running. |
| jvm_gc_total | The number of garbage collection calls. |
| jvm_gc_interval_time_seconds_total | The total time (in seconds) that elapsed between garbage collection calls. |
| jvm_gc_interval_total | The total number of intervals between garbage collection calls (per-server counter). |
| jvm_gc_duration_seconds_total | The total time (in seconds) spent in garbage collection. |
| jvm_threads_daemon_threads | The current number of live daemon threads. |
| jvm_threads_live_threads | The current number of live threads including both daemon and non-daemon threads. |
| jvm_threads_peak_threads | The peak live thread count since the Java virtual machine started or peak was reset. |
| jvm_classes_loaded_classes | The number of classes that are currently loaded in the Java virtual machine. |
| jvm_classes_unloaded_classes_total | The total number of classes unloaded since the Java virtual machine has started execution. |
| process_files_open_files | The open file descriptor count. |
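The JVM metrics above can be combined in PromQL queries, for example in Grafana panels. The queries below are illustrative sketches using only metric names from the table:

```promql
# Heap utilization as a fraction of the maximum available memory
sum(jvm_memory_used_bytes) / sum(jvm_memory_max_bytes)

# Time spent in garbage collection per second, averaged over the last 5 minutes
rate(jvm_gc_duration_seconds_total[5m])
```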