Monitoring - Using Java Agent

The November 2024 release adds an option to enable a Java Agent that monitors the dashboard and collects metrics, traces and logs for observability.

 

Example:

The configuration below does the following:

  1. Launch the Dashboard with Java Agent Enabled: The application is started with the OpenTelemetry Java agent, which uses the configured OTLP exporter. This setup allows the application to send telemetry data (traces and metrics in this example) to the OpenTelemetry Collector for further processing.

  2. Collect and Process Telemetry Data: The OpenTelemetry Collector will be set up to receive telemetry data from the application via OTLP protocols (HTTP on port 4318 and gRPC on port 4317).

  3. Export Metrics: The processed metrics are then exposed for Prometheus to scrape at the endpoint 0.0.0.0:8889. This integration allows real-time monitoring and visualisation of the dashboard’s performance.

  4. Export Traces: The OpenTelemetry Collector will forward the traces to Zipkin, a distributed tracing system. Zipkin will visualise these traces, helping track the flow of requests across the dashboard and identify any performance bottlenecks or failures.

First we need to define the configuration file for OpenTelemetry Collector. Example:

config.yml

receivers:
  otlp: # Defines the OTLP receiver to accept telemetry data (traces/metrics)
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch/traces:
    timeout: 1s
    send_batch_size: 50
  batch/metrics:
    timeout: 1s
    send_batch_size: 50
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889 # The network address and port where Prometheus can scrape metrics
  zipkin:
    endpoint: "http://zipkin:9411/api/v2/spans" # Traces are exported to Zipkin at defined url
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch/metrics]
      exporters: [prometheus]
    traces:
      receivers: [otlp]
      processors: [batch/traces]
      exporters: [zipkin]
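Each pipeline in the collector configuration must only refer to receivers, processors and exporters that are declared elsewhere in the same file. The following is a minimal Python sketch of that cross-check, not the collector's own validation. The function name `undefined_pipeline_components` is hypothetical, and the configuration is copied in by hand as a dictionary so that the sketch needs no YAML library.

```python
def undefined_pipeline_components(config):
    """Return (pipeline, section, name) triples for components that a
    pipeline uses but the configuration never declares."""
    missing = []
    pipelines = config.get("service", {}).get("pipelines", {})
    for pipeline_name, pipeline in pipelines.items():
        for section in ("receivers", "processors", "exporters"):
            declared = config.get(section) or {}
            for component in pipeline.get(section, []):
                if component not in declared:
                    missing.append((pipeline_name, section, component))
    return missing


# The collector configuration from config.yml, written as a plain dictionary.
collector_config = {
    "receivers": {
        "otlp": {
            "protocols": {
                "grpc": {"endpoint": "0.0.0.0:4317"},
                "http": {"endpoint": "0.0.0.0:4318"},
            }
        }
    },
    "processors": {
        "batch/traces": {"timeout": "1s", "send_batch_size": 50},
        "batch/metrics": {"timeout": "1s", "send_batch_size": 50},
    },
    "exporters": {
        "prometheus": {"endpoint": "0.0.0.0:8889"},
        "zipkin": {"endpoint": "http://zipkin:9411/api/v2/spans"},
    },
    "service": {
        "pipelines": {
            "metrics": {
                "receivers": ["otlp"],
                "processors": ["batch/metrics"],
                "exporters": ["prometheus"],
            },
            "traces": {
                "receivers": ["otlp"],
                "processors": ["batch/traces"],
                "exporters": ["zipkin"],
            },
        }
    },
}

# An empty list means every pipeline entry refers to a declared component.
print(undefined_pipeline_components(collector_config))
```

A misspelled name, such as `batch/trace` in a pipeline's processor list, would appear in the returned list.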

 

We then define a configuration file for Prometheus. Example:

prometheus.yml

global:
  scrape_interval: 15s # Default time interval between scrapes for all jobs unless overridden
  evaluation_interval: 15s # Default time interval to evaluate alerting and recording rules
scrape_configs:
  - job_name: 'example'
    metrics_path: '/metrics' # Endpoint where metrics are exposed for the target
    scrape_interval: 5s # Override the global `scrape_interval` to scrape this job every 5 seconds
    static_configs:
      - targets: ['otel-collector:8889'] # The address of the target to scrape

 

We then put everything together in one docker compose file. This will include:

  • The latest version of the dashboard, with environment variables that enable the Java Agent and specify the OTLP exporter endpoint and service name

  • An external database

  • OpenTelemetry Collector

  • Prometheus for scraping metrics

  • Zipkin for tracing

docker-compose.yml

services:
  dashboard:
    image: ${LATEST_DASHBOARD_IMAGE}
    ports:
      - 8226:8226
    environment:
      PI_DB_HOST: database
      PI_DB_PASSWORD: password
      PI_DB_USERNAME: root
      PI_DB_SCHEMA_NAME: dashboard
      PI_DB_PORT: 3306
      PI_EXTERNAL_DB: "true"
      PI_LICENCE: ${LICENCE_KEY}
      PI_TOMCAT_MONITORING_ENABLED: "true"
      PI_TOMCAT_MONITORING_OTLP_EXPORTER_ENDPOINT: "http://otel-collector:4318"
      PI_TOMCAT_MONITORING_OTLP_EXPORTER_PROTOCOL: "http/protobuf"
      PI_TOMCAT_MONITORING_SERVICE_NAME: "pi-dashboard"
      PI_TOMCAT_PORT: 8226
    healthcheck:
      test: [ "CMD", "/bin/bash", "/var/panintelligence/tomcat_healthcheck.sh" ]
      interval: 10s
      start_period: 60s
      retries: 3
  database:
    image: mariadb:10.9.4
    environment:
      MARIADB_DATABASE: dashboard
      MARIADB_ROOT_PASSWORD: password
      LANG: C.UTF-8
    command: --lower_case_table_names=1 --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
    restart: always
    ports:
      - "3306:3306"
  otel-collector:
    image: otel/opentelemetry-collector
    container_name: otel-collector
    command: [ "--config=/etc/config.yml" ]
    volumes:
      - /path_to_config_files/config.yml:/etc/config.yml # Mounts the local collector configuration file into the container at the specified path
    ports:
      - "4317:4317" # OTLP gRPC protocol for receiving telemetry data
      - "4318:4318" # OTLP HTTP protocol for receiving telemetry data
      - "8889:8889" # Maps port 8889 for Prometheus scraping metrics from the collector
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - "9090:9090" # Maps port 9090 for accessing the Prometheus web interface
    volumes:
      - /path_to_prometheus_config/prometheus.yml:/etc/prometheus/prometheus.yml # Mounts the local Prometheus configuration file into the container
  zipkin:
    image: openzipkin/zipkin:2.23
    container_name: zipkin
    ports:
      - "9411:9411"

Things to note:

  1. All three files (config.yml, prometheus.yml and docker-compose.yml) should exist for this example to work successfully

  2. Amend volumes for otel-collector and prometheus - make sure to point to the correct path within your local directory

  3. Make sure all ports specified are available
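One way to check the ports before starting the stack is a short script. The following is a minimal Python sketch. The function name `port_is_free` is hypothetical, and the port list is taken from the host-side port mappings in docker-compose.yml. It only tests whether something on the local machine already accepts connections on each port, so treat the result as a hint rather than a guarantee.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if no listener accepts a TCP connection on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, which means
        # another process is already listening on the port.
        return probe.connect_ex((host, port)) != 0


# Host ports published by the example docker-compose.yml.
EXAMPLE_PORTS = [8226, 3306, 4317, 4318, 8889, 9090, 9411]

for port in EXAMPLE_PORTS:
    status = "free" if port_is_free(port) else "in use"
    print(f"{port}: {status}")
```

If a port is reported as in use, either stop the process that holds it or change the host side of the matching port mapping in docker-compose.yml.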

After running docker compose up, navigate to the dashboard and log in to confirm that it is working as expected.

Navigate to http://localhost:8889/metrics - you will see the metrics that the OpenTelemetry Collector exposes for Prometheus to scrape. Example:

 

pi1.png
p2.png
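The metrics endpoint returns plain text in the Prometheus exposition format, so it can also be inspected from a script. The following is a minimal Python sketch that lists the distinct metric names found in such output. The function names `fetch_metrics` and `metric_names` are hypothetical, and the sample text is invented for illustration; the metric names your dashboard actually reports will differ.

```python
import urllib.request


def fetch_metrics(url="http://localhost:8889/metrics"):
    """Download the raw metrics text from the collector's Prometheus endpoint."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def metric_names(exposition_text):
    """Collect the distinct sample names from Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line is "name{labels} value" or "name value".
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return sorted(names)


# Invented sample output, used here instead of a live request.
SAMPLE_OUTPUT = """\
# HELP example_thread_count Hypothetical gauge for illustration
# TYPE example_thread_count gauge
example_thread_count{job="pi-dashboard"} 42
example_request_count{job="pi-dashboard",method="GET"} 7
example_request_count{job="pi-dashboard",method="POST"} 3
"""

print(metric_names(SAMPLE_OUTPUT))
```

With the stack running, `metric_names(fetch_metrics())` applies the same parsing to the live endpoint.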

 

Navigate to the Zipkin URL (http://localhost:9411/zipkin/) and click ‘Run Query’. Example output: