Infrastructure Monitoring with OpenTelemetry

Example using New Relic

August 06, 2022 · 14 mins read

Introduction

We’ll use the OpenTelemetry Collector Contrib distribution (otelcol-contrib) to gather host metrics on Windows, Linux, and Mac and send them to New Relic. The examples use version 0.57.2, the latest otelcol-contrib release at the time of this writing.
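
The download links used in the sections below all follow the same naming pattern: version, operating system, and CPU architecture. As a rough sketch for Linux and Mac, the pattern could be captured like this. The function names are made up for illustration and are not part of any OpenTelemetry tool, and the uname mappings only cover the platforms used in this post.

```shell
#!/bin/sh
# Sketch only: builds the otelcol-contrib release URL used in this post.
# VERSION matches the release used here; the helper names are hypothetical.
VERSION="0.57.2"
BASE="https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download"

# Map `uname -m` output to the architecture suffix in the asset name.
asset_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Map `uname -s` output to the operating system part of the asset name.
asset_os() {
  case "$1" in
    Linux)  echo "linux" ;;
    Darwin) echo "darwin" ;;
    *) echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}

# asset_url <os> <arch> prints the full download URL.
asset_url() {
  echo "${BASE}/v${VERSION}/otelcol-contrib_${VERSION}_$1_$2.tar.gz"
}

# Usage on the machine itself (not run here):
#   asset_url "$(asset_os "$(uname -s)")" "$(asset_arch "$(uname -m)")"
```

On a 64-bit Raspberry Pi, uname -m reports aarch64, which maps to the arm64 package.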

Windows

The following steps assume you are running Windows PowerShell as an Administrator.

  1. Create a directory somewhere to keep the installation in one place.
     mkdir "C:\Program Files\OpenTelemetry"
    


  2. Change to the newly created directory.
     cd "C:\Program Files\OpenTelemetry"
    


  3. Download the otelcol-contrib package from GitHub.
     (New-Object Net.WebClient).DownloadFile("https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.57.2/otelcol-contrib_0.57.2_windows_amd64.tar.gz", "C:\Program Files\OpenTelemetry\otelcol-contrib_0.57.2_windows_amd64.tar.gz")
    


  4. Extract the files to the current directory, using either the command line or 7-Zip.
     tar -xf otelcol-contrib_0.57.2_windows_amd64.tar.gz
    


  5. Download a sample configuration file for the OpenTelemetry Collector Contrib from GitHub.
     (New-Object Net.WebClient).DownloadFile("https://gist.githubusercontent.com/pnvnd/1533b04609fbbe583056afdc31683667/raw/242f9fa682989639a06fc22eb209b8ffd1a4d4c3/otel-config_windows.yaml", "C:\Program Files\OpenTelemetry\otel-config_windows.yaml")
    

    Otherwise, you can create otel-config_windows.yaml manually with the following:

     extensions:
       health_check:
    
     receivers:
       hostmetrics:
         collection_interval: 20s
         scrapers:
           cpu:
             metrics:
               system.cpu.utilization:
                 enabled: true
           load:
           memory:
             metrics:
               system.memory.utilization:
                 enabled: true
           disk:
           filesystem:
             metrics:
               system.filesystem.utilization:
                 enabled: true
           network:
           paging:
             metrics:
               system.paging.utilization:
                 enabled: true
     #     processes:
           process:
             mute_process_name_error: true
    
     processors:
       memory_limiter:
         check_interval: 1s
         limit_mib: 1000
         spike_limit_mib: 200
       batch:
       cumulativetodelta:
         include:
           metrics:
             - system.network.io
             - system.disk.operations
             - system.network.dropped
             - system.network.packets
             - process.cpu.time
           match_type: strict
       resource:
         attributes:
           - key: host.id
             from_attribute: host.name
             action: upsert
       resourcedetection:
         detectors: [env, system]
    
     exporters:
       otlp:
         endpoint: https://otlp.nr-data.net:4317
         headers:
           api-key: YOUR_KEY_HERE
    
     service:
       pipelines:
         metrics:
           receivers: [hostmetrics]
           processors: [batch, resourcedetection, resource, cumulativetodelta]
           exporters: [otlp]
    

    Note: The processes scraper is commented out in the Windows configuration because it is not supported on Windows.

  6. Edit otel-config_windows.yaml and replace YOUR_KEY_HERE with your New Relic Ingest - License key, which ends with NRAL.
    ((Get-Content -path "C:\Program Files\OpenTelemetry\otel-config_windows.yaml" -Raw) -replace 'YOUR_KEY_HERE','XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXNRAL') | Set-Content -Path "C:\Program Files\OpenTelemetry\otel-config_windows.yaml"
    


  7. Create a Windows service to run the OpenTelemetry Collector in the background. Replace DESKTOP-IED569R\Peter with your own computer and account name; PowerShell will prompt for the password. The executable path is wrapped in double quotes because it contains a space.
    New-Service -Name "otel-infra" -BinaryPathName '"C:\Program Files\OpenTelemetry\otelcol-contrib.exe" --config=file:"C:\Program Files\OpenTelemetry\otel-config_windows.yaml"' -DisplayName "OpenTelemetry Infrastructure Agent" -StartupType Manual -Description "OpenTelemetry Infrastructure Agent v0.57.2" -Credential "DESKTOP-IED569R\Peter"
    


    Note: When the service runs under the Local System account, it cannot gather storage metrics, so you’ll need to create a service account or use your current account credentials to get these metrics.

  8. Start the otel-infra service from services.msc or from the command line.
    net start "otel-infra"
    


  9. Log into New Relic and check your Hosts to see metrics.

  10. Stop the otel-infra service with
    net stop "otel-infra"
    
  11. If you need to uninstall the service and remove the files, use these commands. Use sc.exe rather than sc, because sc on its own is a PowerShell alias for Set-Content.
    sc.exe delete otel-infra
    cd ~
    rm -r "C:\Program Files\OpenTelemetry"
    


Linux

As with the Windows setup, the commands below do the same thing, this time on a Raspberry Pi 4B.

  1. Make a directory somewhere to save the files and configurations
     sudo mkdir /opt/opentelemetry
    
  2. Change to the newly created directory
     cd /opt/opentelemetry
    
  3. Download the otelcol-contrib package for Linux. In this example, we’re running this on a Raspberry Pi 4B (8GB) with Raspbian (64-Bit), so we’re going to get the arm64 version.
     sudo wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.57.2/otelcol-contrib_0.57.2_linux_arm64.tar.gz
    
  4. Download a sample configuration file for Linux
     sudo wget https://gist.githubusercontent.com/pnvnd/1533b04609fbbe583056afdc31683667/raw/93c55d4065bc783d3f340931cca91fceacccaae8/otel-config_linux.yaml
    

    Otherwise, create otel-config_linux.yaml manually with the following

     extensions:
       health_check:
    
     receivers:
       hostmetrics:
         collection_interval: 20s
         scrapers:
           cpu:
             metrics:
               system.cpu.utilization:
                 enabled: true
           load:
           memory:
             metrics:
               system.memory.utilization:
                 enabled: true
           disk:
           filesystem:
             metrics:
               system.filesystem.utilization:
                 enabled: true
           network:
           paging:
             metrics:
               system.paging.utilization:
                 enabled: true
           processes:
           process:
             mute_process_name_error: true
    
     processors:
       memory_limiter:
         check_interval: 1s
         limit_mib: 1000
         spike_limit_mib: 200
       batch:
       cumulativetodelta:
         include:
           metrics:
             - system.network.io
             - system.disk.operations
             - system.network.dropped
             - system.network.packets
             - process.cpu.time
           match_type: strict
       resource:
         attributes:
           - key: host.id
             from_attribute: host.name
             action: upsert
       resourcedetection:
         detectors: [env, system]
    
     exporters:
       otlp:
         endpoint: https://otlp.nr-data.net:4317
         headers:
           api-key: YOUR_KEY_HERE
    
     service:
       pipelines:
         metrics:
           receivers: [hostmetrics]
           processors: [batch, resourcedetection, resource, cumulativetodelta]
           exporters: [otlp]
    
  5. Extract the files
     sudo tar -xf otelcol-contrib_0.57.2_linux_arm64.tar.gz
    
  6. Check the version of your otelcol-contrib package
     sudo ./otelcol-contrib --version
    
  7. Edit otel-config_linux.yaml and update with your New Relic Ingest - License key
     sudo nano otel-config_linux.yaml
    

    Otherwise run this command

     sudo sed -i 's/YOUR_KEY_HERE/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXNRAL/' otel-config_linux.yaml
    
  8. Start your OpenTelemetry Collector in the background, pointing it at the configuration file in /opt/opentelemetry
     sudo nohup ./otelcol-contrib --config=file:"/opt/opentelemetry/otel-config_linux.yaml" &
    
  9. Log into New Relic and check your hosts for metrics

  10. To check whether otelcol-contrib is running, run
     sudo ps aux | grep otelcol-contrib
    
  11. Stop otelcol-contrib with this command
     sudo pkill otelcol-contrib
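
The nohup approach above does not survive a reboot. As a rough Linux counterpart to the Windows service, here is a minimal sketch that prints a systemd unit, assuming a systemd-based distribution and the /opt/opentelemetry layout from steps 1 to 5. The unit name, description, and restart policy are my own assumptions, not something shipped in the tar.gz package.

```shell
#!/bin/sh
# Sketch only: prints a systemd unit for the collector to stdout.
# render_unit is a hypothetical helper; the install directory defaults to the
# path used in this post and can be overridden with the first argument.
render_unit() {
  install_dir="${1:-/opt/opentelemetry}"
  cat <<EOF
[Unit]
Description=OpenTelemetry Collector Contrib (host metrics)
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=${install_dir}/otelcol-contrib --config=file:${install_dir}/otel-config_linux.yaml
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (not run here):
#   render_unit | sudo tee /etc/systemd/system/otelcol-contrib.service
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now otelcol-contrib
```

With no User= line, the unit runs as root, which matches the sudo commands used in the steps above.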
    

Mac

Mac support is still being added. At the time of this writing, otelcol-contrib does not capture cpu and some disk/storage metrics on Mac.

  1. Download the darwin otelcol-contrib package for your Mac: amd64 for Intel Macs or arm64 for newer Macs. In this example, I’ll be using https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.57.2/otelcol-contrib_0.57.2_darwin_amd64.tar.gz

  2. Once downloaded, double-click on the tar.gz package to extract the files.

  3. Create a folder somewhere; /Users/yourname/Desktop/otel-infra should be fine.

  4. Copy or move the extracted otelcol-contrib file to the otel-infra folder. Then, right-click it and choose Open.

  5. Click Open. Nothing visible should happen; we do this only to give your Mac permission to run the OpenTelemetry Collector in a terminal.

  6. Download a configuration file for otelcol-contrib here:
    https://gist.githubusercontent.com/pnvnd/1533b04609fbbe583056afdc31683667/raw/d34ca012a2035b5d6636fa6867247f5418cc3322/otel-config_mac.yaml
    

    Otherwise, create the file manually with the following:

     extensions:
       health_check:
    
     receivers:
       hostmetrics:
         collection_interval: 20s
         scrapers:
     #     cpu:
     #       metrics:
     #         system.cpu.utilization:
     #           enabled: true
           load:
           memory:
             metrics:
               system.memory.utilization:
                 enabled: true
     #     disk:
           filesystem:
             metrics:
               system.filesystem.utilization:
                 enabled: false
           network:
           paging:
             metrics:
               system.paging.utilization:
                 enabled: true
           processes:
     #     process:
    
     processors:
       memory_limiter:
         check_interval: 1s
         limit_mib: 1000
         spike_limit_mib: 200
       batch:
       cumulativetodelta:
         include:
           metrics:
             - system.network.io
             - system.disk.operations
             - system.network.dropped
             - system.network.packets
             - process.cpu.time
           match_type: strict
       resource:
         attributes:
           - key: host.id
             from_attribute: host.name
             action: upsert
       resourcedetection:
         detectors: [env, system]
    
     exporters:
       otlp:
         endpoint: https://otlp.nr-data.net:4317
         headers:
           api-key: YOUR_KEY_HERE
    
     service:
       pipelines:
         metrics:
           receivers: [hostmetrics]
           processors: [batch, resourcedetection, resource, cumulativetodelta]
           exporters: [otlp]
    

    Notice that cpu and disk have been commented out, as they have not been implemented for Mac in this version. However, if you enable them anyway, you may still get a few storage metrics.

  7. Edit otel-config_mac.yaml and replace YOUR_KEY_HERE with your Ingest - License key from New Relic.

  8. From the otel-infra folder, with otel-config_mac.yaml saved alongside otelcol-contrib, run this command to start the OpenTelemetry Collector:
    ./otelcol-contrib --config=file:./otel-config_mac.yaml
    


  9. Log into New Relic and check the results.
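
Step 7 above is a manual edit. If you’d rather script it, here is a minimal sketch that works with both the BSD sed on Mac and the GNU sed on Linux. The helper names set_api_key and key_is_set are hypothetical, and the sketch assumes the key contains only letters and digits, so it needs no escaping inside the sed expression.

```shell
#!/bin/sh
# Sketch only: replaces the YOUR_KEY_HERE placeholder in a collector config.
# set_api_key and key_is_set are hypothetical helpers, not part of otelcol-contrib.

# set_api_key <config-file> <license-key>
set_api_key() {
  config="$1"
  key="$2"
  # Write to a temp file and copy it back. This avoids `sed -i`, whose
  # syntax differs between GNU sed (Linux) and BSD sed (Mac).
  tmp="$(mktemp)" || return 1
  sed "s/YOUR_KEY_HERE/${key}/" "$config" > "$tmp" && cat "$tmp" > "$config"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# key_is_set <config-file> succeeds only if the placeholder is gone.
key_is_set() {
  ! grep -q "YOUR_KEY_HERE" "$1"
}

# Usage (not run here):
#   set_api_key ./otel-config_mac.yaml "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXNRAL"
#   key_is_set ./otel-config_mac.yaml || echo "placeholder still present"
```

Running key_is_set before starting the collector is a cheap way to catch a forgotten placeholder instead of finding out from a rejected export.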

Remarks

Overall, collecting host metrics with the OpenTelemetry Collector Contrib package is still in beta, and breaking changes can happen in future releases. However, the progress so far is quite impressive. Feel free to check out the Host Metrics Receiver README for updates.