Analyzed Fields
Selected fields in the analytics data are "analyzed". Analysis consists of tokenizing a block of text into individual terms suitable for use in an inverted index and normalizing these terms into a standard form to improve their searchability. The ADQL Comparison Operators behave differently for analyzed and non-analyzed fields. To construct queries on analyzed fields, you need to understand some concepts about how the tokens are built.
Uppercase letters in analyzed fields are all converted to lowercase to build tokens.
Delimiters and other factors (such as CamelCase terms) affect how strings are tokenized. All non-alphanumeric characters are delimiters. For a string such as "myname@company.com", "@" is a delimiter, so myname and company.com are separate tokens. Because "." is also a delimiter, company.com is itself split into the two tokens company and com. A term that uses CamelCase, such as VicePresident, is tokenized into separate tokens based on recognition of the CamelCase nature of the term. After lowercasing, this results in the tokens vice and president as well as vicepresident.
For an example string such as <VicePresident:SalesAndMarketing> - EMEAAustraliaUSA94107, the tokens generated include the following:
- vicepresident
- vice
- president
- salesandmarketing
- sales
- and
- marketing
- emeaaustraliausa94107
- emeaaustraliausa
- emea
- australia
- usa
- 94107
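The tokenization rules above can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the function name sketch_tokens and the three regular expressions are hypothetical and are not the analyzer that the product actually runs. They were chosen so that the example string reproduces the token list shown above, and the real analyzer may behave differently on other inputs.

```python
import re

# Hypothetical patterns, chosen to mirror the rules described in this section.
CHUNK = re.compile(r"[A-Za-z0-9]+")                    # split at non-alphanumeric delimiters
RUN = re.compile(r"[A-Za-z]+|[0-9]+")                  # separate letter runs from digit runs
CAMEL = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")    # split CamelCase and acronyms


def sketch_tokens(text):
    """Return lowercased tokens in first-seen order (illustrative only)."""
    tokens = []

    def add(token):
        token = token.lower()          # all tokens are lowercased
        if token not in tokens:        # keep each token once
            tokens.append(token)

    for chunk in CHUNK.findall(text):
        add(chunk)                     # whole delimited chunk, e.g. emeaaustraliausa94107
        for run in RUN.findall(chunk):
            add(run)                   # letters-only or digits-only part
            if run.isalpha():
                for part in CAMEL.findall(run):
                    add(part)          # CamelCase pieces, e.g. vice, president
    return tokens


if __name__ == "__main__":
    for token in sketch_tokens("<VicePresident:SalesAndMarketing> - EMEAAustraliaUSA94107"):
        print(token)
```

Under the same assumptions, sketch_tokens("myname@company.com") yields myname, company, and com, which is why a pattern containing "@" or "." cannot match any single token.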
Analytics Analyzed Fields
The analyzed fields in analytics events are:
- Logs: Message
- Transactions: Errors and Error Detail
- Mobile: stacktrace
Queries on Analyzed Fields
Full-text search is supported on analyzed fields, including the message field for logs, using the LIKE operator. See Comparison Operators.
On analyzed fields, the REGEXP operator is matched only against the individual analyzed and processed tokens, so a REGEXP pattern cannot span the complete message.
Consider this log message:
[2016-06-09 06:07:50,118] [INFO ] [org.springframework.jms.listener.DefaultMessageListenerContainer#0-1] [com.appdynamics.provision.OrderMessageListener] [AD_REQUEST_GUID[2b8ce807-986c-45f9-a5b4-2fe5a6fd90f3]] Received a message to process the order Order_3138 for the user myname@company.com
In this log message, myname and company.com are two separate tokens because "@" is a delimiter. Searching a log message like this for the email address therefore requires searching across tokens.
A query such as the following, which uses REGEXP, fails because myname and company.com are two separate tokens.
SELECT * FROM logs WHERE appName = 'yourAppName' AND sourceType = 'yourLogFile' AND message REGEXP 'myname@company.*'
The LIKE operator is not affected by delimiters, so a query that uses the LIKE operator is a better choice.
SELECT * FROM logs WHERE appName = 'yourAppName' AND sourceType = 'yourLogFile' AND message LIKE 'myname@company'
You can also use wildcards in a LIKE query because they work across tokens.
SELECT * FROM logs WHERE appName = 'yourAppName' AND sourceType = 'yourLogFile' AND message LIKE 'myname@company*'
Example Queries for Analyzed Fields
Consider the following analytics log events:
Jun 7 11:27:45 appd sshd[1032]: Illegal user test from 110.49.183.11
Jun 7 11:27:46 appd sshd[1032]: Failed password for illegal user test from 110.49.183.11 port 9218 ssh2
Jun 7 11:27:46 appd sshd[1032]: error: Could not get shadow information for NOUSER
The following queries against these events return these results:
Query | Results |
---|---|
SELECT * FROM logs WHERE sourceType='yourLogFile' AND message REGEXP 'illegal.+user' | Does not match any log events in the sample because the query string spans multiple tokens. Use LIKE in a case like this. |
SELECT * FROM logs WHERE sourceType='yourLogFile' AND message REGEXP 'illegal.*' | Matches the first two log events. |
SELECT * FROM logs WHERE sourceType='yourLogFile' AND message REGEXP 'Failed*' | Does not match any log events in the sample because the token exists only in lowercase, as 'failed'. |
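The results in the table can be reproduced with a small simulation. This is a sketch, not the product's query engine: it assumes that REGEXP must match one whole token at a time, it uses Python's re syntax in place of the ADQL regular expression dialect, and the names simple_tokens, regexp_matches, and matching_events are hypothetical. The tokenizer here only lowercases and splits at non-alphanumeric characters, which is enough for these events.

```python
import re

# The three sample log events from this section.
EVENTS = [
    "Jun 7 11:27:45 appd sshd[1032]: Illegal user test from 110.49.183.11",
    "Jun 7 11:27:46 appd sshd[1032]: Failed password for illegal user test "
    "from 110.49.183.11 port 9218 ssh2",
    "Jun 7 11:27:46 appd sshd[1032]: error: Could not get shadow information for NOUSER",
]


def simple_tokens(message):
    """Lowercase the message and split it at every non-alphanumeric character."""
    return re.findall(r"[a-z0-9]+", message.lower())


def regexp_matches(pattern, message):
    """Assumed REGEXP semantics: the pattern must match one entire token."""
    compiled = re.compile(pattern)
    return any(compiled.fullmatch(token) for token in simple_tokens(message))


def matching_events(pattern):
    """Return the 1-based positions of the sample events that the pattern selects."""
    return [
        position
        for position, event in enumerate(EVENTS, start=1)
        if regexp_matches(pattern, event)
    ]


if __name__ == "__main__":
    for pattern in ("illegal.+user", "illegal.*", "Failed*"):
        print(pattern, matching_events(pattern))
```

Under these assumptions, 'illegal.*' selects the first two events, while 'illegal.+user' and 'Failed*' select none, in line with the table. A lowercase pattern such as 'failed.*' selects the second event.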
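To search for the string javaIOException, write the pattern against the lowercased token. The following pattern matches java followed by 11 lowercase letters: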
Query |
---|
SELECT * FROM transactions WHERE application = 'yourApp' AND segments.errorList.errorCode REGEXP 'java[a-z][a-z][a-z][a-z][a-z][a-z][a-z][a-z][a-z][a-z][a-z]' |