Introduction

During execution, Gateway and Apex generate event log records that provide an audit trail, which can be used to understand system activity and to diagnose problems. By default, the event log records are stored in the local file system and can later be used for analysis and diagnostics.

Gateway also provides a universal ability to forward and store Gateway and Apex event log records in third-party data sources. You can then use external tools to store, query, and report on the log events. To do this, configure a logger appender in the Gateway configuration files.

Configuring Logger Appenders

Gateway and Apex client processes run on the machine node where the Gateway instance is installed. You can therefore configure their logger appenders using the regular log4j properties file (datatorrent/releases/3.9.0/conf/dtgateway.log4j.properties).

Following is an example log4j properties configuration for the Socket Appender:

log4j.rootLogger=${dt.root.logger.level},tcp

log4j.appender.tcp=org.apache.log4j.net.SocketAppender
log4j.appender.tcp.RemoteHost=logstashnode1
log4j.appender.tcp.Port=5400
log4j.appender.tcp.ReconnectionDelay=10000
log4j.appender.tcp.LocationInfo=true

You can use the regular attribute property “apex.attr.LOGGER_APPENDER” to configure the logger appenders for the Apex Application Master and its containers. The attribute can be defined in the dt-site.xml configuration file (global, local, and user) or in the static and runtime application properties.

Use the following syntax to enter the logger appender attribute value:

{comma-separated-appender-names};{comma-separated-appenders-properties}

Following is an example logger appender attribute configuration for the Socket Appender:

  <property>
    <name>apex.attr.LOGGER_APPENDER</name>
    <value>tcp;log4j.appender.tcp=org.apache.log4j.net.SocketAppender,
           log4j.appender.tcp.RemoteHost=logstashnode1,
           log4j.appender.tcp.Port=5400,
           log4j.appender.tcp.ReconnectionDelay=10000,
           log4j.appender.tcp.LocationInfo=true
    </value>
  </property>
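
The same attribute can also be supplied when the application is launched. Following is a minimal sketch using the Apex CLI's -D launch option; the application package name myapp.apa is illustrative, and the exact CLI invocation may vary by release:

apex> launch -Dapex.attr.LOGGER_APPENDER="tcp;log4j.appender.tcp=org.apache.log4j.net.SocketAppender,log4j.appender.tcp.RemoteHost=logstashnode1,log4j.appender.tcp.Port=5400" myapp.apa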

Integrating with ElasticSearch and Splunk

You can use different methods to store event log records in an external data source; however, we recommend the following method:

Configure Gateway and Apex to use the Socket Appender to send logger events to Logstash; Logstash can then deliver the event log records to any of its output data sources. The example below shows an integration workflow with ElasticSearch and Splunk.

Following is the corresponding Logstash configuration:

input {
  # Receive logger events from the Socket Appender.
  log4j {
    mode => "server"
    port => 5400
    type => "log4j"
  }
}

filter {
  # Transform logger events into event log records.
  mutate {
    remove_field => [ "@version","path","tags","host","type","logger_name" ]
    rename => { "apex.user" => "user" }
    rename => { "apex.application" => "application" }
    rename => { "apex.containerId" => "containerId" }
    rename => { "apex.applicationId" => "applicationId" }
    rename => { "apex.node" => "node" }
    rename => { "apex.service" => "service" }
    rename => { "dt.node" => "node" }
    rename => { "dt.service" => "service" }
    rename => { "priority" => "level" }
    rename => { "timestamp" => "recordTime" }
  }
  # log4j timestamps are epoch milliseconds.
  date {
    match => [ "recordTime", "UNIX_MS" ]
    target => "recordTime"
  }
}

output {
  # Send event log records to the ElasticSearch cluster.
  elasticsearch {
    hosts => ["esnode1:9200","esnode2:9200","esnode3:9200"]
    index => "apexlogs-%{+YYYY-MM-dd}"
    manage_template => false
  }

  # Send event log records to Splunk over TCP.
  tcp {
    host => "splunknode"
    mode => "client"
    port => 15000
    codec => "json_lines"
  }
}
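
On the Splunk side, a TCP data input must listen on the port targeted by the Logstash tcp output. Following is a minimal sketch of such an input using Splunk's standard inputs.conf mechanism; the index name apexlogs is illustrative and must already exist in Splunk:

# $SPLUNK_HOME/etc/system/local/inputs.conf
# Listen on the port used by the Logstash tcp output above.
[tcp://15000]
# _json is Splunk's built-in sourcetype for JSON events (matches the json_lines codec).
sourcetype = _json
# Illustrative index name; create it in Splunk before use.
index = apexlogs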

ElasticSearch users can use the Kibana reporting tool for analysis and diagnostics. Splunk users can use Splunk Web.
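
Indexed records can also be queried directly through the ElasticSearch search API. Following is a minimal sketch that retrieves the ten most recent ERROR-level records; the host esnode1 and the apexlogs-* index pattern come from the example above, the level and recordTime fields are produced by the rename filter, and recordTime is assumed to be mapped as a date in your index template:

# Query the ten most recent ERROR-level event log records.
curl -s 'http://esnode1:9200/apexlogs-*/_search?pretty' \
  -H 'Content-Type: application/json' -d '
{
  "query": { "match": { "level": "ERROR" } },
  "sort": [ { "recordTime": { "order": "desc" } } ],
  "size": 10
}'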

Links to third-party tools:

Logstash: https://www.elastic.co/products/logstash
ElasticSearch: https://www.elastic.co/products/elasticsearch
Kibana: https://www.elastic.co/products/kibana
Splunk: https://www.splunk.com