This page applies to an earlier version of the AppDynamics App IQ Platform.
For documentation on the latest version, see the 4.4 Documentation.



To capture and present log records as analytics data, you must configure one or more log sources for the Analytics Agent. Once set up, the log source is used by the Analytics Agent to import records from the log file, structure the records according to your configuration, and send the data to the Analytics Processor. From there, the Controller presents the data in the Application Analytics UI. 

(info) Make sure you have installed and configured the components described in Installing Agent-Side Components and, for on-premises installations, Install the Controller and Install the Events Service before attempting to configure Log Analytics.

Set Up Log Analytics 

The general steps to configure log analytics are: 

  1. Describe the Log Source in a Job File
  2. Reuse or Create Grok Expressions 
  3. Verify Analytics Agent Properties

Describe the Log Source in a Job File 

Each log source is represented by a job file. A job file is a configuration file that specifies the location of the source log file, the pattern for structuring the records in the log, and other options for capturing records from the log source.

To define a source, you create a job file (or modify one of the samples) in the Analytics Agent configuration directory. The Analytics Agent includes sample job files for GlassFish, OS X logs, and others. 

The job files are located in the following directory:

  • <Analytics_Agent_Home>/conf/job/ 

The agent reads the job files in the directory dynamically, so you can add job files to the directory without restarting the agent.

To configure a job file, use the following settings in the file (a combined sample sketch appears after this list): 

  • enabled: Determines whether this log source is active. Set this to true to have the Analytics Agent attempt to capture records from this log source.
  • file: The location and name of the log file to serve as a log source. The log file must be on the local machine.
    (info) If you are using a wildcard for the filename, you must surround it with quotes. For example, nameGlob: "*.log".
  • multiline: For log file formats in which a single log record may span multiple lines, configure the multiline property to indicate how the individual records in the log file are identified. A typical example of a multiline log record is one that includes a Java exception. If the log source includes multiline records, use either of the following attributes of the multiline property to identify the line that starts each multiline record (subsequent lines are treated as continuations):
    • startsWith: A simple prefix that matches the start of the multiline log record. For example, to match this log record:

      [#|2015-09-24T06:33:31.574-0700|INFO|glassfish3.1.2|com.appdynamics.METRICS.WRITE|_ThreadID=206;_ThreadName=Thread-2;|NODE PURGER Completed in 14 ms|#]

      You could use this:

      multiline:
         startsWith: "[#|"
    • regex: A regular expression that matches the multiline log record. For example, to match this log record (from <Analytics_Agent_Home>/conf/job/sample-postgres-log.job):

       2015-01-14 17:38:07 IST WARNING:  skipping "pg_db_role_setting" --- only superuser can vacuum it
      
      

      You could use this: 

       multiline:
          regex: "\\d{4}-\\d{2}-\\d{2}.*"

      Note: If the particular format of a multiline log file does not permit reliable matching of continuation lines by regular expression, you can choose to treat the file as single-line records. For most types of logs, this still results in the capture of the majority of log records.

  • fields: Specifies the context of the log data in the Controller UI, such as the application name and tier name. Specify the fields as free-form key-value pairs. 
  • grok: The grok field specifies the patterns by which the data in the unstructured log record is mapped to structured analytics fields. It associates a named grok expression (as defined in a .grok file in the <Analytics_Agent_Home>/conf/grok directory) with a field in the data as structured by the agent. For example:

    grok:
      patterns: 
           - "\\[%{LOGLEVEL:logLevel}%{SPACE}\\]  \\[%{DATA:threadName}\\]  \\[%{JAVACLASS:class}\\]  %{GREEDYDATA:logMessage}"
           - "pattern 2"
           ... 

    In this case, the grok-pattern name LOGLEVEL is matched to an analytics data field named logLevel. The regular expression specified by the name LOGLEVEL is defined in the file grok-patterns.grok in the grok directory. For more about grok expressions, see Reuse or Create Grok Expressions.
    (info) Previous versions of Log Analytics used a single "pattern" rather than a pattern list. This mode is still supported for backward compatibility.

  • eventTimestamp: This setting defines the pattern for the timestamp associated with captured data.  
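
Putting these settings together, a minimal job file might look like the following sketch. This is an illustration only: the exact key names and nesting (particularly for the file location) vary between agent versions, so compare it against one of the shipped samples, such as sample-analytics-log.job, before relying on it.

enabled: true                            # activate this log source
file:                                    # assumed layout; check a shipped sample for the exact keys
  path: "/var/log/myapp"                 # placeholder location of the source log file
  nameGlob: "*.log"                      # wildcards must be quoted
multiline:
  startsWith: "[#|"                      # a line beginning with this prefix starts a new record
fields:                                  # free-form key-value pairs giving the data context in the UI
  applicationName: "myApp"               # placeholder values
  tierName: "web"
grok:
  patterns:
    - "\\[%{LOGLEVEL:logLevel}%{SPACE}\\]  \\[%{DATA:threadName}\\]  \\[%{JAVACLASS:class}\\]  %{GREEDYDATA:logMessage}"
eventTimestamp:
  pattern: "yyyy-MM-dd'T'HH:mm:ss,SSSZ"  # assumed sub-key; format string taken from the timestamp example below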

Create Extracted Fields on the Fly Using the Log Analytics Controller UI

Creating complex patterns can be tricky. It is sometimes easier to start from a sample job file that successfully maps some of the information in your source log files, and then use an interactive display to explore additional patterns in real time. In these situations, you can set the enabled flag and the file location in a sample job file to begin collecting your logs, and then use the UI to fine-tune your fields dynamically. This process is called creating Extracted Fields.

  1. In the Log Analytics Controller UI, click Create New Fields.
     
  2. The Select Source Type popup appears. Use the dropdown menu to select the log file source (based on the job file you used) that you want to work with.
     
  3. A timestamped list of log entries appears.
     
    Use the counters at the bottom of the page to move through the list, as needed.  Click Next.
  4. The Create New Fields popup appears. 
     
  5. Click Add Field to try a pattern. Enter the pattern, using Java-based regular expressions, in the Pattern field. The pattern includes both the name you want for the field (in the screenshot, Field1) and the regex for the value; see the sketch after these steps.
    (info) Grok-based patterns are not supported. 
  6. Click Apply. The result of the pattern is highlighted in the New Fields Preview panel below, and a column with that name is populated with the discovered value.
  7. To create multiple fields, repeat the process, beginning with Add Field.
  8. When you are satisfied with the results, click Save.
  9. Each Analytics Agent in your installation periodically syncs its definitions with those created in the Controller.
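
As a sketch only, since the exact pattern syntax accepted by the Pattern field can differ by version, a Java regular expression with a named capture group can carry both the field name and the regex for the value. The field name purgeTimeMs below is hypothetical; the log fragment comes from the GlassFish sample record shown earlier:

Sample log fragment:                  NODE PURGER Completed in 14 ms
Pattern (hypothetical field name):    Completed in (?<purgeTimeMs>\d+) ms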

Reuse or Create Grok Expressions 

Grok is a way to define and use complex, nested regular expressions in an easy-to-read, reusable format. Regexes defining discrete elements in a log file are mapped to grok-pattern names, which can also be used to create more complex patterns. Grok-pattern names for many of the common types of data found in logs are already created for you. A list of basic grok-pattern names and their underlying structures can be seen here:

  • <Analytics_Agent_Home>/conf/grok/grok-patterns.grok

The grok directory also contains samples of more complex definitions customized for various common log types, such as java.grok and mongodb.grok.

Additional grok patterns can be seen here: https://grokdebug.herokuapp.com/patterns#

Once the grok-pattern names exist, you associate them in the job file with field identifiers that become the analytics keys. The basic building block is %{grok-pattern name:identifier}, where grok-pattern name is the grok pattern that matches the type of data you want to fetch (based on a regex definition) and identifier is your name for that kind of data, which becomes the analytics key. So %{IP:client} would select an IP address in the log record and map it to the key client.
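
For instance, assuming a hypothetical log line such as "client connected from 10.1.1.1", that building block could appear in a job file like this:

grok:
  patterns:
    - "client connected from %{IP:client}"   # hypothetical line layout; IP is a basic pattern defined in grok-patterns.grok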

Custom grok patterns

Complex grok patterns can be created using nested basic patterns.  For example, from the mongodb.grok file:

MONGO_LOG %{SYSLOGTIMESTAMP:timestamp} \[%{WORD:component}\] %{GREEDYDATA:message}

It is also possible to create entirely new patterns using regular expressions.  For example, the following line from java.grok defines a grok pattern named JAVACLASS.

JAVACLASS (?:[a-zA-Z$_][a-zA-Z$_0-9]*\.)*[a-zA-Z$_][a-zA-Z$_0-9]*

Because JAVACLASS is defined in a .grok file in the grok directory, it can be used as if it were a basic grok pattern. In a job file, you can use the JAVACLASS pattern match as follows:

grok:
  pattern: ".... \\[%{JAVACLASS:class}\\]"

In this case, the field name as it appears in the Application Analytics UI would be "class". For a full example, see the following files: 

  • Job file: <Analytics_Agent_Home>/conf/job/sample-analytics-log.job
  • Grok file: <Analytics_Agent_Home>/conf/grok/java.grok

Special Considerations for Backslashes

The job file is in YAML format, which treats the backslash as an escape character. Therefore, to include a literal backslash in the string pattern, you need to escape the backslash with a second backslash. An easier approach, which avoids escaping backslashes in the .job file grok pattern altogether, is to enclose the grok pattern in single quotes instead of double quotes. For example:

grok:
  patterns:
    - '\[%{DATESTAMP:TIME}%{SPACE}CET\]%{SPACE}%{NOTSPACE:appId}%{SPACE}%{NOTSPACE:appName}%{SPACE}%{NOTSPACE:Severity}%{SPACE}%{NOTSPACE:messageId}:%{SPACE}%{GREEDYDATA:logMessage}'

Support for Numeric Fields (new in 4.1.3)

In Release 4.1.3, the grok definition syntax was enhanced to support three basic data types. When defining a pattern in the .grok file, you can specify the data type as number, boolean, or string; string is the default. If a grok alias in a .job file uses such a grok definition, the extracted field is stored as a number or boolean. If the number or boolean conversion fails, a message appears in the agent's log file. No validation is performed up front, because a regex cannot be reliably reverse engineered; these are purely runtime extractions and conversions.
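
For example, because the shipped NUMBER pattern is now typed as number (see the list in the upgrade steps below), a job file alias such as the following stores quantity as a numeric field rather than a string:

grok:
  patterns:
    - "%{DATE:happenedAt},%{NUMBER:quantity}"   # quantity is extracted and converted to a number at runtime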

Upgrade pre-4.1.3 Job Files

If pre-4.1.3 (4.1.2 or older) .job files in use have fields that are unspecified or specified as NUMBER, switching to the "type aware" .grok files breaks the existing data in the Events Service because of the changed type mapping. To avoid this, modify the grok aliases in your job files. 
Examples:

Was:
grok:
  patterns:
    - "%{DATE:happenedAt},%{NUMBER:quantity}"

Update job to:
grok:
  patterns:
    - "%{DATE:happenedAt},%{NUMBER:quantity_new}"

Was:
grok:
  patterns:
    - "%{DATE:happenedAt},%{DATA:howMany}"

Update job to:
grok:
  patterns:
    - "%{DATE:happenedAt},%{POSINT:howManyInt}"

To Upgrade (migrate) pre-4.1.3  job files: 

  1. Stop analytics-agent.
  2. Change any .job files that use the following grok patterns, which are now type aware:

    BOOL:boolean
    INT:number
    BASE10NUM:number
    NUMBER:number
    POSINT:number
    NONNEGINT:number


    Change each grok alias so that it does not conflict with the older alias:

    Old:
    grok:
      patterns:
        - "%{DATE:quoteDate},%{NUMBER:open},%{NUMBER:high},%{NUMBER:low},%{NUMBER:close},%{NUMBER:volume},%{NUMBER:adjClose}"

    New aliases:
    grok:
      patterns:
        - "%{DATE:quoteDate},%{NUMBER:openNum},%{NUMBER:highNum},%{NUMBER:lowNum},%{NUMBER:closeNum},%{NUMBER:volumeNum},%{NUMBER:adjCloseNum}"
  3. Start analytics-agent.

Verify Analytics Agent Properties

In addition to configuring the log source in the job file as described above, verify the settings in the analytics-agent.properties file in the conf directory (a sketch of the relevant entries follows this list). In the file: 

  • http.event.endpoint should be the location of the Events Service. 
  • The http.event.accountName and http.event.accessKey settings should be set to the name and access key of the Controller account with which the logs should be associated. By default, they are set to the built-in account for a single-tenant Controller.
  • The pipeline.poll.dir setting specifies where the log configuration (.job) files are located. You would not normally change this unless you want to keep your files in a different location.
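
As an illustration only, with placeholder values (your endpoint, account name, and access key will differ), the relevant entries in analytics-agent.properties look roughly like this:

# Events Service location (placeholder host and port)
http.event.endpoint=http://events-service.example.com:9080
# Controller account to associate the captured logs with (placeholder values)
http.event.accountName=customer1_placeholder
http.event.accessKey=placeholder-access-key
# Directory the agent polls for .job files (assumed default; normally left unchanged)
pipeline.poll.dir=conf/job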

Troubleshoot Logs

If log capture is working correctly, logs should start appearing in the Log tab in the Analytics UI. It can take some time for logs to start accumulating. Note the following troubleshooting points:  

  • If nothing appears in the log view, try searching over the past 24 hours. 
  • Timezone discrepancies between the logs and the local machine can cause log entries to be incorrectly excluded from the selected timeframe in the Controller UI. To remediate, try setting the log file timestamps and system time to UTC, or log the timezone with the log message so you can verify. 
  • An inherent delay in indexing may result in the "last minute" view in the UI consistently yielding no logs. Increase the time range if you encounter this issue.

Troubleshoot Patterns

To help you troubleshoot the data extraction patterns in your job file, you can use the two debug REST endpoints in the Analytics Agent:

(info) In the following examples, the Analytics Agent host is assumed to be localhost and the Analytics Agent port is assumed to be 9090. To configure the port on your agent, use the property ad.dw.http.port in <Analytics_Agent_Home>/conf/analytics-agent.properties.

The Grok Endpoint


The Grok tool works in two modes: extraction from a single-line log and extraction from a multi-line log. To get a description of usage options:

curl -X GET http://localhost:9090/debug/grok 

Single Line

In this mode you pass in (as a POST request) a sample line from your log and the grok pattern you are testing, and you receive back the data you passed in organized as key/value pairs, where the keys are your identifiers.

curl -X POST http://localhost:9090/debug/grok --data-urlencode "logLine=LOG_LINE" --data-urlencode "pattern=PATTERN"

 For example, the input:

curl -X POST http://localhost:9090/debug/grok --data-urlencode "logLine=[2014-09-04T15:22:41,594Z]  [INFO ]  [main]  [o.e.j.server.handler.ContextHandler]  Started i.d.j.MutableServletContextHandler@2b3b527{/,null,AVAILABLE}" --data-urlencode "pattern=\\[%{LOGLEVEL:logLevel}%{SPACE}\\]  \\[%{DATA:threadName}\\]  \\[%{JAVACLASS:class}\\]  %{GREEDYDATA:logMessage}"

would produce this output:

{
 threadName => main
 logLevel => INFO
 class => o.e.j.server.handler.ContextHandler
 logMessage => Started i.d.j.MutableServletContextHandler@2b3b527{/,null,AVAILABLE}
}

The input:

curl -X POST http://localhost:9090/debug/grok --data-urlencode "logLine=2010-05-05,500.98,515.72,500.47,509.76,4566900,509.76" --data-urlencode "pattern=%{DATE:quoteDate},%{DATA:open},%{DATA:high},%{DATA:low},%{DATA:close},%{DATA:volume},%{GREEDYDATA:adjClose}"

would produce this output:

{
 open => 500.98
 adjClose => 509.76
 volume => 4566900
 quoteDate => 10-05-05
 high => 515.72
 low => 500.47
 close => 509.76
}

Multi-line

The multi-line version uses a file stored on the local filesystem as the source input.

curl -X POST http://localhost:9090/debug/grok --data-urlencode "logLine=`cat FILE_NAME`" --data-urlencode "pattern=PATTERN"

where FILE_NAME is the full path filename of the file that contains the multi-line log.

The Timestamp Endpoint


The timestamp tool extracts the timestamp from a log line.

To get a description of usage options: 

curl -X GET http://localhost:9090/debug/timestamp

In this mode you pass in (as a POST request) a sample line from your log and the timestamp pattern you are testing, and you receive back the timestamp contained within the log line.

curl -X POST http://localhost:9090/debug/timestamp --data-urlencode "logLine=LOG_LINE" --data-urlencode "pattern=PATTERN"

For example, the input:

curl -X POST http://localhost:9090/debug/timestamp --data-urlencode "logLine=[2014-09-04T15:22:41,237Z]  [INFO ]  [main]  [io.dropwizard.server.ServerFactory]  Starting DemoMain" --data-urlencode "pattern=yyyy-MM-dd'T'HH:mm:ss,SSSZ"

would produce this output:

{
 eventTimestamp => 2014-09-04T15:22:41.237Z
}

The input:

curl -X POST http://localhost:9090/debug/timestamp --data-urlencode "logLine=Nov 17, 2014 8:21:51 AM com.foo.blitz.processor.core.hbase.coprocessor.endpoint.TimeRollupProcessEndpoint$HBaseDataFetcher callFoo1" --data-urlencode "pattern=MMM d, yyyy h:mm:ss aa"

would produce this output:

{
 eventTimestamp => 2014-11-17T16:21:51.000Z
}