System Monitoring

Overview

System monitoring serves several common purposes:

  • Performance optimization: monitoring CPU and memory usage helps identify and resolve performance issues. For example, abnormally high CPU usage may signal the need for a hardware upgrade or a software optimization.
  • Resource management: analyze system demand to plan for scaling and load spikes.
  • System maintenance: get actionable data in time to ensure the system meets user requirements.
  • Security: monitoring network metrics can help identify and eliminate potential security threats.

Aggregator can be configured to expose application and standard JVM metrics for monitoring your system; refer to Description of Metrics to learn more. You can use Prometheus to collect and store these metrics, and Grafana to visualize them.

Step 1: Enable Metrics

To enable application and JVM metrics, configure Aggregator via Java System Properties or the Aggregator admin.properties configuration file. Once configured, the metrics are available at the endpoint /agg/api/metrics (for example, http://localhost:port/agg/api/metrics) in a format compatible with Prometheus.
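To check that the endpoint responds once metrics are enabled, you can read it without Prometheus. The following is a minimal sketch, assuming the endpoint returns the plain Prometheus text format (one `name value` sample per line, with `#` comment lines). The helper names `parse_metrics` and `fetch_metrics`, the sample payload, and the `area="heap"` label are illustrative assumptions, not part of Aggregator.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text-format lines into {sample name (with labels): value}."""
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labelled sample: keep everything up to the closing brace as the name.
            end = line.rindex("}") + 1
            name, rest = line[:end], line[end:].split()
        else:
            name, *rest = line.split()
        # The first token after the name is the value; an optional timestamp may follow.
        samples[name] = float(rest[0])
    return samples


def fetch_metrics(base_url):
    """Download and parse the metrics exposed at <base_url>/agg/api/metrics."""
    with urllib.request.urlopen(base_url + "/agg/api/metrics", timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Hypothetical sample payload; real output depends on which metrics are enabled.
SAMPLE = """\
# HELP aggregator_processes_active The number of successfully running connectors.
# TYPE aggregator_processes_active gauge
aggregator_processes_active 3.0
jvm_memory_used_bytes{area="heap"} 1.5E7
"""

if __name__ == "__main__":
    print(parse_metrics(SAMPLE))
```

Against a running instance, `fetch_metrics("http://localhost:port")` with your actual host and port would return the same kind of dictionary.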

Java System Properties

To enable system metrics, add these variables to your docker or docker-compose file:

# docker-compose.yml example
#   Aggregator.metrics.enable               enable metrics for Aggregator
#   Aggregator.metrics.enableJvmMetrics     enable JVM metrics (disabled by default)
#   Aggregator.metrics.enableTomcatMetrics  enable Tomcat metrics (disabled by default)

environment:
  - JAVA_OPTS=
    -DAggregator.metrics.enable=true
    -DAggregator.metrics.enableJvmMetrics=true
    -DAggregator.metrics.enableTomcatMetrics=true

admin.properties

To enable system metrics, specify the following parameters in the admin.properties configuration file:

# enable metrics for Aggregator
Aggregator.metrics.enable=true
# enable JVM metrics (disabled by default)
Aggregator.metrics.enableJvmMetrics=true
# enable Tomcat metrics (disabled by default)
Aggregator.metrics.enableTomcatMetrics=true
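Before restarting Aggregator, it can be useful to confirm which of the three flags are actually set in the file. The sketch below is one way to do that, under stated assumptions: it treats the file as simple `key=value` lines, skips lines that start with `#` or `!`, and drops a trailing ` # ...` note from a value. How Aggregator itself parses trailing comments is not documented here, so the stripping step is an assumption, and `read_properties` and `enabled_flags` are hypothetical helper names.

```python
METRIC_FLAGS = (
    "Aggregator.metrics.enable",
    "Aggregator.metrics.enableJvmMetrics",
    "Aggregator.metrics.enableTomcatMetrics",
)


def read_properties(text):
    """Read key=value pairs from properties-style text into a dict."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and full-line comments.
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Assumption: a trailing " # ..." note is not part of the value.
        value = value.split(" #", 1)[0].strip()
        props[key.strip()] = value
    return props


def enabled_flags(text):
    """Report, for each metrics flag, whether it is set to true."""
    props = read_properties(text)
    return {flag: props.get(flag, "").lower() == "true" for flag in METRIC_FLAGS}


if __name__ == "__main__":
    example = "Aggregator.metrics.enable=true\nAggregator.metrics.enableJvmMetrics=true\n"
    for flag, on in enabled_flags(example).items():
        print(flag, "enabled" if on else "not enabled")
```

To check a real file, pass its contents, for example `enabled_flags(open("admin.properties").read())`.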

Kubernetes Configuration

To enable all types of metrics in Kubernetes:

Enable serviceMonitor in the Aggregator values.yaml. Note that it is enabled by default.

# include this in Aggregator values.yaml to enable system metrics monitoring
serviceMonitor:
  enabled: true
  namespace: monitoring
  interval: "30s"
  labels:
    monitoring: application
Refer to Deployment in Kubernetes to learn more.

Step 2: Get Metrics

You can use Prometheus to collect and store system metrics. To periodically scrape metrics from Aggregator, set metrics_path to the Aggregator endpoint /agg/api/metrics in the Prometheus configuration file:

# Example of configuration for Prometheus

global:
  scrape_interval: 1s # Scrape targets every second.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'external-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# here it is Aggregator.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'Aggregator'

    metrics_path: /agg/api/metrics

    # Scrape targets from this job every second (same as the global setting).
    scrape_interval: 1s

    static_configs:
      - targets: ['localhost:8011']
To learn how to install and configure Prometheus, refer to Prometheus Installation and Prometheus Getting Started.
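Once Prometheus is scraping Aggregator, the stored values can be read back through the Prometheus HTTP API instant-query endpoint `/api/v1/query`. The sketch below shows how such a request could be built and how its JSON response could be decoded. The Prometheus address `http://localhost:9090` is an assumption (adjust it to your installation), and `query_url`, `instant_values`, and `run_query` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request


def query_url(prometheus_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    query = urllib.parse.urlencode({"query": promql})
    return prometheus_url.rstrip("/") + "/api/v1/query?" + query


def instant_values(response_body):
    """Decode an instant-query response into a list of (labels, value) pairs."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed: %s" % payload.get("error"))
    # Each vector item carries its labels and a [timestamp, "value"] pair.
    return [
        (item["metric"], float(item["value"][1]))
        for item in payload["data"]["result"]
    ]


def run_query(prometheus_url, promql):
    """Send the query to Prometheus and return the decoded samples."""
    with urllib.request.urlopen(query_url(prometheus_url, promql), timeout=5) as resp:
        return instant_values(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed Prometheus address; prints the URL that would be requested.
    print(query_url("http://localhost:9090", "aggregator_processes_failed"))
```

With the scrape configuration above, the returned labels would typically include `job="Aggregator"`, since Prometheus adds the job name to every scraped series.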

Step 3: Use Metrics

You can visualize the collected system metrics in Grafana dashboards:

  1. Install Grafana

  2. Install and configure Prometheus data source

  3. Add and configure Grafana dashboards

This is a sample illustration of the JVM Actuator Grafana dashboard with Aggregator JVM metrics:

Disable JVM Metrics

There are two ways to disable JVM metrics, depending on the method you used to enable them:

Java System Properties

To disable JVM metrics, add these variables to your docker or docker-compose file:

# docker-compose.yml example

environment:
  - JAVA_OPTS=
    -DAggregator.metrics.disableJvmMetrics=true # enable or disable JVM metrics

admin.properties

To disable JVM metrics, specify the following parameters in the admin.properties configuration file:

Aggregator.metricsService=QSMetricsServiceInfo
# enable or disable JVM metrics
Aggregator.metricsService.disableJvmMetrics=true

Description of Metrics

Aggregator metrics

Aggregator Application Metric    Description
aggregator_processes_deployed    The number of configured data connectors.
aggregator_processes_active      The number of successfully running connectors.
aggregator_processes_failed      The number of failed connectors.

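The three connector counters can be combined into a simple status check. The sketch below is one possible interpretation, not an Aggregator feature: the status names and the rule (any failed connector is "failed", fewer active than deployed is "degraded") are assumptions chosen for illustration, and `connector_health` is a hypothetical helper.

```python
def connector_health(deployed, active, failed):
    """Classify connector state from the three Aggregator counters."""
    if failed > 0:
        # At least one connector reported a failure.
        return "failed"
    if active < deployed:
        # Some configured connectors are not running, but none has failed.
        return "degraded"
    return "ok"


if __name__ == "__main__":
    # Hypothetical readings of aggregator_processes_deployed / _active / _failed.
    print(connector_health(deployed=5, active=4, failed=0))
```

The same rule could be expressed as a Prometheus alert condition; the Python form is only meant to make the relationship between the counters explicit.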
List of usable JVM Metrics

JVM Metric                          Description
jvm_memory_committed_bytes          The memory (in bytes) committed for the Java virtual machine runtime to use.
jvm_memory_used_bytes               The used memory (in bytes) in the Java virtual machine runtime.
jvm_memory_max_bytes                The maximum memory (in bytes) available to the Java virtual machine runtime.
process_cpu_utilization             The CPU usage (in percent) of the Java virtual machine.
process_uptime_seconds              The amount of time (in seconds) that the Java virtual machine has been running.
jvm_gc_total                        The number of garbage collection calls.
jvm_gc_interval_time_seconds_total  The total time (in seconds) that elapsed between garbage collection calls.
jvm_gc_interval_total               The total number of intervals between garbage collection calls. Per-server counter.
jvm_gc_duration_seconds_total       The total time (in seconds) spent in garbage collection.
jvm_threads_daemon_threads          The current number of live daemon threads.
jvm_threads_live_threads            The current number of live threads, including both daemon and non-daemon threads.
jvm_threads_peak_threads            The peak live thread count since the Java virtual machine started or the peak was reset.
jvm_classes_loaded_classes          The number of classes that are currently loaded in the Java virtual machine.
jvm_classes_unloaded_classes_total  The total number of classes unloaded since the Java virtual machine started execution.
process_files_open_files            The open file descriptor count.
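
Several of these raw values are easier to read as derived figures. As a rough sketch, memory usage can be expressed as jvm_memory_used_bytes divided by jvm_memory_max_bytes, and the average garbage collection time as jvm_gc_duration_seconds_total divided by jvm_gc_total. The function names below are hypothetical, and the handling of a non-positive maximum assumes the JVM convention of reporting -1 when no maximum is defined.

```python
def memory_usage_ratio(used_bytes, max_bytes):
    """Return used/max as a fraction, or None when no maximum is defined."""
    if max_bytes <= 0:
        # Assumption: an undefined maximum is reported as -1.
        return None
    return used_bytes / max_bytes


def average_gc_seconds(gc_duration_seconds_total, gc_total):
    """Return the mean time per garbage collection call, in seconds."""
    if gc_total == 0:
        # No collections yet, so there is nothing to average.
        return 0.0
    return gc_duration_seconds_total / gc_total


if __name__ == "__main__":
    # Hypothetical readings: 256 MiB used of a 1 GiB maximum, 30 collections in 1.5 s.
    print(memory_usage_ratio(256 * 1024**2, 1024**3))
    print(average_gc_seconds(1.5, 30))
```

Because the garbage collection metrics are cumulative totals, an average over a recent window would be computed from the difference between two readings rather than from the totals themselves.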