Monitoring - Using Java Agent

In our November 2024 release, we added an option to run the dashboard with a Java agent that collects metrics, traces and logs for observability.

For details on enabling this feature, see our release notes: November 2024 - Dashboard Release Notes

Example:

The configuration below does the following:

Launch the Dashboard with Java Agent Enabled: The application is started with the OpenTelemetry Java agent, using the configured OTLP exporter. This allows the application to send telemetry data (traces and metrics in this example) to the OpenTelemetry Collector for further processing.
Collect and Process Telemetry Data: The OpenTelemetry Collector is set up to receive telemetry data from the application via OTLP (gRPC on port 4317 and HTTP on port 4318). In this example the dashboard sends its data over HTTP to port 4318.
Export Metrics: The processed metrics are then exposed for Prometheus to scrape at the endpoint 0.0.0.0:8889. This integration allows real-time monitoring and visualisation of the dashboard’s performance.
Export Traces: The OpenTelemetry Collector forwards the traces to Zipkin, a distributed tracing system. Zipkin visualises these traces, helping to track the flow of requests across the dashboard and identify any performance bottlenecks or failures.

First we need to define the configuration file for OpenTelemetry Collector. Example:

config.yml

receivers:
  otlp: # Defines the OTLP receiver to accept telemetry data (traces/metrics)
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch/traces:
    timeout: 1s 
    send_batch_size: 50  
  batch/metrics:
    timeout: 1s
    send_batch_size: 50  
    
exporters:
  prometheus: # Defines an exporter for Prometheus to expose telemetry data
    endpoint: 0.0.0.0:8889 # The network address and port where Prometheus can scrape metrics
  zipkin:
    endpoint: "http://zipkin:9411/api/v2/spans"
service:
  pipelines:    
    metrics:  
      receivers: [otlp] 
      processors: [batch/metrics] 
      exporters: [prometheus] 
    traces:  
      receivers: [otlp]
      processors: [batch/traces]
      exporters: [zipkin]

We then define a configuration file for Prometheus. Example:

prometheus.yml

global:
  scrape_interval:     15s # Default time interval between scrapes for all jobs unless overridden
  evaluation_interval: 15s # Default time interval to evaluate alerting and recording rules 

scrape_configs: # List of scrape configurations defining how Prometheus should scrape metrics from different targets
  - job_name: 'example'
    metrics_path: '/metrics' # Endpoint where metrics are exposed for the target
    scrape_interval: 5s # Override the global `scrape_interval` to scrape this job every 5 seconds
    static_configs:
      - targets: ['otel-collector:8889'] # The address of the target to scrape

We then put everything together in one Docker Compose file. This will include:

The latest version of the dashboard, with environment variables that enable the Java agent and specify the exporter endpoint, protocol and service name
External database
OpenTelemetry Collector
Prometheus for scraping metrics
Zipkin for tracing

docker-compose.yml

services:
  dashboard:
    image: ${LATEST_DASHBOARD_IMAGE}
    ports:
      - 8226:8226
    environment:
      PI_DB_HOST: database
      PI_DB_PASSWORD: password
      PI_DB_USERNAME: root
      PI_DB_SCHEMA_NAME: dashboard
      PI_DB_PORT: 3306
      PI_EXTERNAL_DB: "true"
      PI_LICENCE: ${LICENCE_KEY}
      PI_TOMCAT_MONITORING_ENABLE_JAVA_AGENT: "true"
      PI_TOMCAT_MONITORING_OTLP_EXPORTER_ENDPOINT: "http://otel-collector:4318"
      PI_TOMCAT_MONITORING_OTLP_EXPORTER_PROTOCOL: "http/protobuf"
      PI_TOMCAT_MONITORING_SERVICE_NAME: "pi-dashboard"
      PI_TOMCAT_PORT: 8226
    healthcheck:
      test: [ "CMD", "/bin/bash", "/var/panintelligence/tomcat_healthcheck.sh" ]
      interval: 10s
      start_period: 60s
      retries: 3
  database:
    image: mariadb:10.9.4
    environment:
      MARIADB_DATABASE: dashboard
      MARIADB_ROOT_PASSWORD: password
      LANG: C.UTF-8
    command: --lower_case_table_names=1 --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
    restart: always
    ports:
      - "3306:3306"
  otel-collector:
    image: otel/opentelemetry-collector
    container_name: otel-collector
    command: [ "--config=/etc/config.yml" ]
    volumes:
      - /path_to_config_files/config.yml:/etc/config.yml # Mounts the local collector configuration file into the container at the specified path
    ports:
      - "4317:4317" # OTLP gRPC protocol for receiving telemetry data
      - "4318:4318" # OTLP HTTP protocol for receiving telemetry data
      - "8889:8889" # Maps port 8889 for Prometheus scraping metrics from the collector
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - "9090:9090" # Maps port 9090 for accessing the Prometheus web interface
    volumes:
      - /path_to_prometheus_config/prometheus.yml:/etc/prometheus/prometheus.yml # Mounts the local Prometheus configuration file into the container
  zipkin:
    image: openzipkin/zipkin:2.23
    container_name: zipkin
    ports:
      - "9411:9411"

Things to note:

All three files (config.yml, prometheus.yml and docker-compose.yml) must exist for this example to work
Amend the volumes for otel-collector and prometheus so that they point to the correct paths to config.yml and prometheus.yml on your local machine
Make sure all ports specified are available
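
The port check above can be scripted before starting the stack. The following is a minimal sketch, assuming the host ports are exactly those published in the example docker-compose.yml and that the check runs on the Docker host itself. The names COMPOSE_PORTS, port_is_free and ports_in_use are illustrative and not part of the dashboard.

```python
import socket

# Host ports published by the example docker-compose.yml (assumption: unchanged).
COMPOSE_PORTS = [8226, 3306, 4317, 4318, 8889, 9090, 9411]


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def ports_in_use(ports=COMPOSE_PORTS):
    """Return the subset of ports that already have a listener."""
    return [p for p in ports if not port_is_free(p)]


if __name__ == "__main__":
    busy = ports_in_use()
    print("Ports already in use:", busy if busy else "none")
```

Any port reported as in use needs to be freed, or remapped on the host side of the corresponding ports entry, before running docker compose up.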

After running docker compose up, navigate to the dashboard and log in to confirm that it is working as expected.
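
Once the stack is running, it may also be worth confirming that Prometheus can reach the collector. The sketch below reads the targets endpoint of the Prometheus HTTP API, assuming Prometheus is published on localhost:9090 as in the example compose file. The function names are illustrative.

```python
import json
from urllib.request import urlopen

# Assumption: Prometheus is reachable on the host port mapped in docker-compose.yml.
PROMETHEUS_TARGETS_URL = "http://localhost:9090/api/v1/targets"


def target_health(payload: dict) -> dict:
    """Map each active target's scrape URL to its health ("up", "down" or "unknown")."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return {t["scrapeUrl"]: t["health"] for t in targets}


def fetch_target_health(url: str = PROMETHEUS_TARGETS_URL) -> dict:
    """Fetch the targets endpoint and summarise the health of each target."""
    with urlopen(url, timeout=5) as resp:
        return target_health(json.load(resp))


if __name__ == "__main__":
    try:
        for scrape_url, health in fetch_target_health().items():
            print(f"{health:8} {scrape_url}")
    except OSError as exc:  # URLError is a subclass of OSError
        print("Could not reach Prometheus:", exc)
```

With the example prometheus.yml, the target for otel-collector:8889 should report "up" after the first scrape interval has passed.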

Navigate to http://localhost:8889/metrics - you will see the metrics that the OpenTelemetry Collector exposes for Prometheus to scrape. Example:

Navigate to the Zipkin URL (http://localhost:9411/zipkin/) and click on ‘run query’. This is the output as an example:

Screenshot from 2024-10-11 12-12-26.png

Screenshot from 2024-10-11 12-14-56.png
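
Both checks can also be run without a browser. The following is a rough sketch, assuming the host ports from the example compose file and that traces appear in Zipkin under the service name "pi-dashboard" set in PI_TOMCAT_MONITORING_SERVICE_NAME. It lists the metric names exposed by the collector and counts recent traces through the Zipkin v2 HTTP API. The helper names are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumptions: host ports as published in the example docker-compose.yml.
METRICS_URL = "http://localhost:8889/metrics"  # collector's Prometheus exporter
ZIPKIN_API = "http://localhost:9411/api/v2"    # Zipkin v2 HTTP API


def metric_names(exposition_text: str) -> list:
    """Return the sorted unique metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # HELP / # TYPE comments
        # A sample line looks like `name{labels} value` or `name value`.
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return sorted(names)


def zipkin_traces_url(service_name: str, limit: int = 10) -> str:
    """Build the Zipkin query URL for recent traces of one service."""
    query = urlencode({"serviceName": service_name, "limit": limit})
    return f"{ZIPKIN_API}/traces?{query}"


if __name__ == "__main__":
    try:
        with urlopen(METRICS_URL, timeout=5) as resp:
            names = metric_names(resp.read().decode("utf-8"))
        print(f"{len(names)} metric names exposed, for example {names[:5]}")
        with urlopen(zipkin_traces_url("pi-dashboard"), timeout=5) as resp:
            traces = json.load(resp)
        print(f"Zipkin returned {len(traces)} recent traces")
    except OSError as exc:  # URLError is a subclass of OSError
        print("Endpoint not reachable:", exc)
```

An empty trace list usually just means the dashboard has not handled any requests yet, so log in and click around before re-running the query.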